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Abstract 


la  seeking  rational  node Is  of  tine  series,  tke  concept  of  approximating 
second  order  statistical  relatioaskipa  (i.e. ,  tke  Ynle-falker  equations)  is 
often  explicitly  or  implicitly  invoked.  Tke  parameters  of  tke  kypotkesixed 
rational  model  are  typically  selected  so  tkat  these  relatioaskipa  'best 
represent'  a  set  of  antocorrelation  lag  estimates  computed  from  time  series 
observations.  One  of  tke  objectives  of  this  paper  vill  be  tkat  of 
estsblisking  tkis  fundamental  approaok  to  tke  generation  of  rational  models. 

An  examination  of  nany  popular  contemporary  spectral  estimation  methods 
reveals  tkat  tke  parameters  of  a  kypotkesixed  rational  model  are  estimated 
upon  using  a  'minimal'  set  of  Yule-' Talker  equation  evaluations.  Tkis  results 
in  an  undesired  parameter  hypersensitivity  and  a  subsequent  decrease  in 
estimation  performance.  To  counteract  tkis  parameter  kypersensitivity,  tke 
concept  of  using  more  tkan  tke  minimal  number  of  Yule-falker  equation 
evaluations  is  kerein  advocated.  It  ia  skovn  tkat  by  taking  tkis 
overdetermined  parametric  evaluation  approaok,  a  reduction  in  data  induced 
model  parameter  kypersensitivity  is  obtained,  and,  a  corresponding 
improvement  in  modeling  performance  results.  Moreover,  upon  adapting  a 
singular  value  decomposition  representation  of  an  extended  order 
autocorrelation  matrix  estimate  to  tkis  procedure,  a  desired  model  order 
determination  netkod  is  obtained  and  a  further  significant  improvement  in 
modeling  performance  is  achieved.  Tkis  approach  makes  possible  tke 
generation  of  low  order,  high  quality  rational  speotral  estimates  from  short 
data  lengths. 
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In _ Introduction 

In  i  variety  of  applications  suoh  as  found  in  radar  dopplor  processing, 
adaptive  filtering,  speech  prooeasiag,  underwater  aoouatios,  seismology, 
econometrics,  spectral  estimation  and  array  processing,  it  is  desired  to 
estimate  the  statistical  characteristics  of  a  wide- sense  stationary  time 
series.  More  often  than  not,  this  required  characterisation  is  embodied  in 
the  tine  series*  autocorrelation  las  sequence  as  specified  by 

*z(n)  *  8  (x(n+m)x(m)}  (1,1) 

in  which  8  and  -  denote  the  operations  of  expectation  and  complex 
conjugation,  respectively.  From  this  definition,  the  well-known  property 
thst  the  autocorrelation  lags  are  complex  conjugate  symmetric  (i.e.,  rx(-n)  « 
*x(n))  is  readily  established.  Ve  will  automatically  assume  this  property 
whenever  negative  lag  autocorrelation  elements  (or  their  estimates)  are 
required. 

The  second  order  statistical  characterisation  as  represented  by  the 
autocorrelation  sequence  may  be  given  an  'equivalent*  frequency  domain 
interpretation.  Namely,  upon  taking  the  Fourier  transform  of  the 
autocorrelation  sequence,  that  is 

m 

sx(ej*>  -  5  **(«)  e-J««  (1.2) 


we  obtain  the  associated  power  spectral  density  function  82(04**)  in  which  the 
normalised  frequency  variable  s  takes  on  values  in  [-w,*t].  This  function 
possesses  a  number  of  salient  properties  among  whioh  are  that  it  is  a 
positive  semidefinite,  symmetric  (if  the  time  series  is  real  valued),  and, 
periodic  funotion  of  m.  This  function  is  seen  to  have  a  Fourier  series 
interpretation  in  whioh  the  autocorrelation  lags  play  the  role  of  the  Fourier 
coefficients.  It  therefore  follows  that  these  coefficients  nay  be  determined 
from  the  power  spectral  density  funotion  through  the  Fourier  series 
coefficient  integral  expression 

ft 

•  ]_  I  Sz(eJ**)  .J-  dm  (1.3) 

In  J 
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Relationships  (1.2)  and  (1.3)  fora  a  Fourier  transfora  pair  so  that  knowledge 
of  the  ant oeor relation  sequence  is  equivalent  to  knowledge  of  the  power 
speetral  density  funotion  and  viee  versa.  Ve  belabor  this  point  in  order  to 
establish  the  viewpoint  that  speetral  estiaation  and  autocorrelation  lag 
estiaation  are  conceptually  equivalent. 

In  the  elassieal  speotral  estiaation  problea,  it  is  desired  to  effeot  an 
estiaate  of  the  underlying  power  speetral  denaity  function  with  this  estiaate 
being  based  on  only  a  finite  set  of  tine  series  observations.  Typically, 
these  observations  will  be  eoaposed  of  a  set  of  contiguous  data  aeasureaents 
taken  at  equispaoed  tiae  intervals  T  as  represented  by 

x(l),  x(2) .  .  .  ..  x(N)  (1.4) 

where  N  will  be  referred  to  as  the  data  length  and  we  have  chosen  to  suppress 
the  saapling  period  T.  It  is  apparent  that  unless  soae  constraints  are 
iaqposed  on  the  basic  nature  of  the  power  speotral  denaity  funotion.  there 
exists  a  fundamental  incoapatability  in  seeking  an  estiaate  of  the  infinite 
parameter  function  (1.2)  (i.e.,  the  infinite  set  of  autocorrelation  lags 

rz(n))  based  on  the  finite  set  of  observations  (1.4).  Investigators  have 
often  resolved  this  dileaaa  by  postulating  a  finite  parameter  aodel  for  the 
power  speetral  density  funotion.  The  tiae  series  observations  (1.4)  are  then 
used  to  fix  the  paraaeters  of  this  parametric  aodel  using  an  appropriate 
estiaation  procedure. 

Without  doubt,  the  aost  widely  used  and  studied  of  finite  paraaetrie 
aodels  are  the  so-called  rational  aodels.  When  employing  a  rational  aodel. 
we  are  seeking  to  approximate  the  generally  infinite  series  expansion  (1.2) 
by  a  magnitude  squared  ratio  of  polynomials  in  the  variable  e~J*.  that  is 

2 

b©  +  bje”J*  +  ...  +  bqe"JV» 

S(eJ“)  -  _  (1.5) 

1  +  a*  e"J*»  +  ...  +  Spe"JP* 

The  finite  nnaber  of  paraeters  in  this  aodel  then  provides  the  aechanisa  for 
circumventing  the  aforeaentioned  parameter  aisaatoh  dileaaa.  Naaely.  if  the 
data  length  parameter  N  adequately  exceeds  this  rational  function's  nuaber  of 
paraaeters  (i.e..  p+q+1),  then  it  is  feasible  to  utilise  the  given  tiae 
series  observations  (1.4)  to  estiaate  values  for  these  paraaeters.  A  few 
words  are  now  appropriate  concerning  the  adequacy  of 


rational  models  in  representing  power  apeotral  density  functions.  It  is  well 
known  that  if  a  power  spectral  denaity  function  is  continnona  in  the  variable 
«*»  then  it  nay  be  approximated  arbitrarily  closely  by  a  rational  function  of 
form  (l.S)  if  the  order  parameters  p  and  4  axe  seleoted  suitably  large  [41]. 
Comforted  by  this  knowledge,  rational  fnnetiona  have  beoome  a  standard  tool 
of  spectral  eatimation  theoretic! ana.  ha  an  intereating  side  note*  it  is 
ironical  that  the  origin  of  spectral  estiution  was  in  the  nee  of  rational 
modela  for  characterising  time  series  composed  of  sinusoids  in  white  noise. 
Members  of  this  olasa  of  time  series  possess  diacontinnous  power  speotral 
density  functions  and  are  therefore  presumably  not  representable  by  a 
rational  model.  As  we  will  see  in  Section  III.  however,  it  is  possible  to 
suitably  adapt  a  specif io  rational  model  so  as  to  satisfactorily  characterise 
this  olass  of  time  series. 

This  paper  is  primarily  concerned  with  developing  a  modeling  method 
whieh  utilises  an  overdetermined  set  of  statistical  equations  for  estimating 
a  rational  model's  parsmeters.  Vsing  thia  approach,  it  is  found  that  the 
resultant  modeling  performance  is  generally  better  than  that  achieved  by 
other  popularly  used  parametric  methods.  Although  the  approach  here  taken 
reflects  heavily  upon  the  author's  previous  works  [IS] -[22] .  much  of  this 
paper  will  be  concerned  with  formulating  many  contemporary  spectral 
estimation  methods  in  a  common  autocorrelation  representation  setting.  It 
must  be  emphasised  that  our  main  objective  is  not  that  of  giving  an 
encylopedio  ooverage  of  the  many  available  rational  speotral  estimation 
techniques.  This  paper  in  conjunction  with  the  excellent  reoent  publications 
[23] .[31] .[37] *  however,  provides  a  reasonable  complete  ooverage  of 
parametric  methods. 

In  the  remainder  of  this  section,  we  shall  consider  two  special  classes 
of  rational  functions  and  give  a  brief  historical  perspective  on  their  usage 
in  spectral  estiswtion  theory.  These  two  classes  are  commonly  referred  to  as 
the  moving  average  (MA)  and  the  autoregressive  (AR)  speotral  models.  A 
moving  average  model  is  defined  to  be  a  rational  funotion  (1.3)  in  whieh  all 
the  at  parameters  are  aero  (i.e.,  it  has  only  numerator  dynamics)  while  an 
autoregressive  model  is  one  for  whieh  all  the  b^  parameters  are  aero  except 
for  bQ  (i.e.,  it  has  only  denominator  dynamics).  By- in- large,  these  two 
olasaes  of  rational  functions  have  formed  the  basic  modeling  tools  in 
eontsmporary  spectral  estimation  theory. 
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mate! 

Fourier  analysis  has  played  a  primary  role  in  much  of  the  earlier  as 
veil  as  more  recent  efforts  at  speetrally  characterizing  experimentally 
collected  data.  As  an  example*  Schuster  applied  the  periodogram  method  for 
detecting  hidden  periodicities  in  son  spot  activity  data  at  the  turn  of  the 
century  [58] .  In  a  more  recent  classical  work*  Blackman  and  Tukay  presented 
a  generalized  procedure  for  effecting  spectral  estimates  [8] .  This  involved 
the  two  step  procedure  of  (i)  determining  autocorrelation  lag  estimates  rz(n) 
using  the  provided  data*  and*  (ii)  taking  the  Fourier  transform  of  these 
estimates.!  The  power  spectral  density  estimate  vhieh  arose  vhem  taking 
this  approach  then  took  the  form 

4 

Sx<ei<*)  -  J  w(n)  rz(n)  e"j«  (1.5) 

®— 4 

where  w(n)  is  a  symmetric  data  window  that  is  chosen  to  achieve  various 
desirable  effects  such  as  side  lobe  reduction.  This  window  is  often  selected 
to  be  rectangular  in  which  case  v(n)  *  1  although  other  choices  may  be  more 
desirable  for  a  given  application.  A  description  of  some  of  the  more  popular 
choices  for  the  data  window  may  be  found  in  numerous  texts  (e.g.*  see  refs. 
[33]«[50],[>57]). 

In  the  BlaekmanrTuksy  estimate  (1.5) ,  it  is  seen  that  only  a  finite 
number  of  summand  terms  (i.e.,  24+1)  are  involved  in  the  speetral  estimate. 
This  is  a  direct  conse4uence  of  the  fact  that  only  a  finite  set  of 
autocorrelation  lag  estimates  are  obtainable  from  the  observation  set  (1.4) 
if  standard  lag  estimation  methods  are  employed.  Due  to  this  finite  sum 
structure*  we  will  now  show  that  the  Blackman-Tukey  estimation  method  is  a 
special  case  of  the  more  general  rational  MA  spectral  model.  In  particular* 
a  speotral  model  is  said  to  be  a  movina  averaac  model  of  order  4  (i.e.. 
MA(4))  if  it  may  be  put  into  the  form 


!  Ve  shall  hereafter  use  the  'sret  v  cl  (a)  to  denote  a  statistical 
estimate. 
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S||A<eJ“)  -  I  be  +  bx  e“j»  + 


•  •  * 


(1.7) 


+  bq  I2 

-  I  Bq(eJ**)  |2 

The  q+1  paraaeters  bQ,  bj ,  bq  which  identify  this  MA(q)  nodal  are  aaan 

to  font  a  qtb  order  polynomial  Bq(«j*)  in  the  variable  e~j«*.  A  aoving 
average  nodal  ia  then  seen  to  be  a  special  ease  of  the  aore  general  rational 
nodal  (1.5)  in  which,  the  denoainator  polynoaial  has  been  set  equal  to  the 
constant  one. 

If  the  polynoaial  Bq(ejw)  constituting  the  aoving  average  aodel  (1.7)  is 
factored*  it  is  possible  to  provide  additional  insight  into  a  MA  aodel' s 
properties.  This  factorization  is  seen  to  give  rise  to  the  equivalent 
representation 

q 

sMA(«J“)  “  lb0 I 2  IT  (1  “  *kc“j“)(l  -  zkeJ“)  (1.8) 

k-1 

in  vhioh  the  zk  are  the  roots  of  the  polynoaial  Bq(eJM).  The  zeroes  of  a  MA 
spectral  aodel  are  seen  to  ocour  in  reciprocal  pairs.  Due  to  the  basic 
nature  of  this  faotorization,  aoving  average  aodels  are  therefore  also 
"vmonly  referred  to  as  all-zero  aodels.  If  any  of  the  roots  a*  are  close  to 
the  unit  circle  (i.e.,  it  is  dear  that  8111(01**)  will  oontain 

sharply  defined  notches  at  frequencies  in  a  neighborhood  asaoeiated  with 
these  roots  (i.e.,  •  ■  s^).  It  is  therefore  apparent  that  MA  aodels  will  be 
particularly  effective  when  approziaating  spectra  that  contain  sharply 
defined  notches  (zero  like  behavior)*  but*  do  not  oontain  sharply  defined 
peaks.  Whenever  a  speotrua  contains  sharply  defined  peaks*  it  is  possible  to 
siaulate  their  effeot  at  the  oost  of  aany  additional  zeroes  (i.e.*  a  high  MA 
order)  for  an  adequate  representation.  With  this  in  aind*  MA  aodels  should 
be  noraally  avoided  whenever  a  peaky  type  behavior  in  the  underlying  speotrua 
is  suspected  (as  aay  be  aade  evident  froa  a  preliainary  Blaekaan-Tukey 
estiaate) . 

To  establish  the  fact  that  the  Blaekaan-Tukey  approach  to  spectral 
eatiaation  is  of  a  aoving  average  structure*  it  is  possible  to  give  yet 
another  equivalent  representation  to  the  MA(q)  expression  (1.7) .  This  will 


entail  explicitly  carrying  oat  the  indicated  polynomial  product 
Bq(eJ**)  Bq(eJ“)  thereby  giving 

4 

Sjutej*)  -  ^  °n  e_jwn  (1.9) 

n*-q 


in  which  the  complex  conjugate  symmetric  cn  parameters  are  related  to  the 
original  bn  parameters  according  to 
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°n  “  ^  b*  bk-n  “4lai4  (1.10) 

k-0 


Upon  setting  the  cn  equal  to  w(n)rz(n) ,  it  is  apparent  that  the 
Blaokman-Tukey  estimate  (1.6)  is  a  special  form  MA(q)  model.  This  fact  is 
usually  overlooked  by  investigators  who  have  considered  the  Blackman-Tnkey 
method  as  well  as  the  periodogram  as  nonparametrie  spectral  estimators.  When 
viewed  from  the  approach  here  taken,  however,  each  of  these  procedures  is 
recognized  as  being  a  realization  of  a  HA  parametric  model. 

AR  Model 

When  we  compare  the  MA(q)  spectral  model  expression  (1.9)  with  the 
theoretical  power  spectral  density  function  (1.2)  which  is  being  estimated, 
it  is  apparent  that  a  serious  modeling  mismatch  can  arise  whenever  the 
underlying  autocorrelation  lags  are  such  that  the  rz(n)  are  not  approximately 
equal  to  zero  for  n  >  q.  For  example,  this  undesirable  condition  arises  when 
the  time  series  under  study  is  composed  of  sinusoids  in  white  noise. 
Conversely,  this  condition  does  not  arise  for  broad  band  signals.  The 
sinusoid  example  is  mentioned  since  it  forms  one  of  the  more  interesting 
special  case  time  series  to  which  speotral  estimation  techniques  are  applied. 
A  special  treatment  of  the  sinusoids  in  white  noise  case  will  be  given  in 
Section  III. 

In  recognition  of  this  potential  shortcoming  of  HA  models,  investigators 
have  examined  alternate  rational  speotral  models  which  do  not  invoke  the 
unnecessarily  harsh  requirement  of  a  truncated  autocorrelation  lag  behavior. 
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Undo abt ably,  the  most  widely  used  of  such  model c  is  the  AR  model.  Namely,  a 
speotral  model  is  said  to  be  as  autoregressive  model  of  order  p  (i.e.,  AR(p)) 
if  it  may  be  pnt  iato  the  form 

I* 

1  +  a1e-j«'+  a2e“J2m  +  ...  +  a-e-jP**!  (1.11) 

h>0l2 

lAp(ej«)|2 

This  AR(p)  model  has  a  functional  behavior  whieh  is  completely  charaeterized 
by  its  p+1  parameters  b0,  aj_,  aj,  ...»  Sp.  The  characteristic  order 
, polynomial  Ap(eJM)  is  seen  to  influence  the  frequency  behavior  of  the 
estimate  while  the  parameter  b0  controls  the  level. 

As  in  the  MA  model  oase,  valuable  insight  iato  the  capabilities  of  AR 
modeling  is  provided  upon  factoring  the  polynomial  Ap(eJM).  This  is  found  to 
result  in  the  equivalent  representation 

l*o  I2 

Sgatei*)  -  — - -  (1.12) 

7T  (i-pkb^m-pveJ*) 

k-1 

where  the  p^  are  the  roots  of  Ap(eJ*) .  The  poles  of  this  AR  speotral  model 
are  seen  to  occur  in  reciprocal  pairs.  For  reasons  whieh  are  self  evident, 
the  AR(p)  spectral  model  is  also  oommonly  referred  to  as  an  all-pole  model. 
As  such,  it  is  particularly  appropriate  for  modeling  spectra  whioh  contain 
sharply  defined  peaks  (pole  like  behavior),  but,  do  not  contain  sharply 
defined  notches.  If  a  spectrum  does  possess  notches,  however,  it  is  possible 
to  simulate  their  effect  at  the  cost  of  many  additional  poles  (i.e.,  a  high 
AR  order).  In  terms  of  parsaeter  parsimony,  it  is  therefore  prudent  to  avoid 
AR  models  whenever  motohes  in  the  underlying  spectrum  are  suspeoted  (this  may 
be  made  evident  from  a  preliminary  Blaekaan-Tukey  estimate) . 

Autoregressive  models  were  used  by  Yule  [66]  and  Walker  [63]  in 
forecasting  trends  of  economically  based  time  series.  These  models  were  then 
employed  by  Burg  [13]  in  1967  and  Parsen  [53]  in  1968  to  achieve  spectral 
estimates  which  did  not  possess  the  aforementioned  deficiencies  of  the  MA 
model.  The  Burg  method  is  of  particular  interest  since  it  offered  a  new 


StftUJ*)  - 
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insight  into  spsotrsl  aodeling  and  introduced  a  nnaber  of  coneepts  that  are 
now  standard  tools  of  speotral  estiaation.  This  includes  an  effieient 
lattice  structured  iapleaentation  of  the  Burg  nethod  which  has  ainoe  been 
exaained  and  advaneed  by  aany  investigators  (e.g.«  see  ref.  [44]).  It  is  not 
an  exaggeration  to  say  that  Burg's  aethod  gave  rise  to  a  literal  explosion  in 
research  activity  directed  towards  evolving  improved  rational  aodeling 
aethods. 

ABMA  Models 

In  aany  applications,  the  underlying  power  speotral  density  funetion 
will  oontsin  both  notch  and  peak  like  behavior.  As  such,  neither  the  HA  nor 
the  AK  aodel  is  the  nost  appropriate  aodel  representation  froa  a  par  awe  ter 
parsiaony  view  point.  The  aore  general  rational  aodel  (l.S).  however,  is 
capable  of  efficiently  representing  suoh  behavior.  This  aost  general 
rational  aodel  is  coaaonly  referred  to  as  an  autoregressive— aoving  average 
aodel  of  order  (p.q)  (i.e.,  ASMA  (p.q))  with  its  frequency  characterisation 
being  given  by 

2 

|b©  +  bie“J“  +  ...  +  bqe‘"J®**| 

SAUIhUi-)  -  |l  +  ale-J-  +  ...  ♦  1 

|Bq(ej“)|2 

“  lAp(ej«)|  (1.13) 

An  A&MA  aodel  is  seen  to  have  a  frequency  characterisation  which  is  the 
ooaposite  of  a  HA  and  an  AR  aodel.  To  further  reinforce  this  interpretation, 
we  have  the  following  equivalent  representation  upon  factoring  the 
polynomials  Ap(©j«)  md  Bq(eJw)  which  characterise  its  frequency  behavior 

q 

TT  (1  -  *k  •■■!“)  (1-Tkej“) 

SabmaCcJ**)  -  |b0l2  ii _ 

fj*  (1  -  Pi  e-j*)(l  -  Pk  eJ“)  (1.14) 

k-1 


An  ASMA  nodel  is  seen  to  possess  q  zeroes  and  p  poles,  end,  as  such  it  is 
generally  a  much  more  effective  nodel  titan  are  its  more  specialized  XA  (all 
zero)  and  AS  (all  pole)  nodel  counterparts.  These  poles  and  zeroes  are  seen 
to  occur  in  reciprocal  pairs. 

Althongh  ASHA  nodels  are  the  nost  preferable  choice  for  nost 
applications,  nary  practitioners  have  opted  to  utilize  either  KA  or  AS 
nodels.  There  is  an  increasing  awareness,  however,  of  the  general 
superiority  of  ASMA  nodeling.  This  has  given  rise  to  a  renewed  effort  to 
generate  eonpntationally  efficient  ASMA  nodeling  algorithns.  A  particularly 
effective  approach  to  ASMA  nodeling  will  be  presented  in  this  paper. 


II.  Rational  Modeling  -  Exact  Autocorrelation  Knowledge 

la  this  section,  the  theoretical  autocorrelation  charaoteristios  of  HA, 
AR  and  ARMA  randoa  processes  are  examined  separately.  This  characterisation 
will  in  tarn  enable  ns  to  intelligently  select  the  most  appropriate  rational 
model  which  best  represents  a  given  set  of  exact  autocorrelation  lags 

*x(0),  rx(l),  .  .  .,  rx(s)  (2.1) 

Moreover,  a  systematic  procedure  for  identifying  the  selected  model's 
par sme tors  from  these  given  autocorrelation  lag  values  is  also  developed. 
Although  the  assumption  here  made  of  exaot  ai tooor relation  information  is 
highly  idealistio  and  almost  never  met  in  applications,  the  insight  thereby 
provided  is  helpful  when  considering  the  more  practical  problem  of  generating 
rational  model  estimates  from  raw  time  series  observations. 

To  begin  this  analysis,  it  will  be  hereafter  assumed  that  the  time 
series  under  examination  is  generated  (or  can  be  adequately  modeled)  as  the 
response  associated  with  the  linear  operator 

P  4 

x(n)  +  J  *k  x(n-k)  -  J  bfc  e(n-k)  (2.2) 

k-1  k-0 

in  which  the  excitation  time  series  (e(n))  is  taken  to  be  a  sequence  of  aero 
mean,  unit  variance,  uneorrelated  random  variables  (i.e.,  normalised  white 
noise)  that  is  taken  to  be  unobservable.  This  excitation-response  behavior 
is  depioted  in  Figure  2.1.  Using  standard  techniques,  it  is  readily  shown 
that  the  power  speotral  density  function  associated  with  the  response  time 
series  is  given  by  the  AKMA(p,q)  rational  form 

2 

b©  +  bi  e”J"  +  ...  +  bq  e"J4«  | 

1  +  *j,  e“j«*  +  ...  +  tp  e"jP“  I 

Thus,  there  is  an  equivalency  between  an  assumed  ARMA  (p,q)  spectral  model, 
and,  the  response  of  the  recursive  linear  operator  (2.2)  to  white  noise.  In 
this  seotion,  the  required  rational  modeling  will  be  developed  through  use  of 
the  time  series  description  (2.2)  and  its  associated  autocorrelation 
characterisation.  It  is  interesting  to  note  that  most  available  rational 
spectral  estimation  techniques  are  based  upon  a  time  domain  characterisation. 


Sx(eJ")  - 
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s(n) 

Bq(eJ*) 

x(n) 

- - 

Vhite  Noise 

Response 

Excitation  Transfer  Function 

Figure  2.1.  Model  of  rational  tine  aeriea. 

The  aeohaaiaa  for  effeetiag  the  required  rational  node  ling  are  the 
so-called  Tule-Valker  equations  which  govern  linear  relationahip  (2.2) . 
Namely,  upon  multiplying  both  aides  of  this  relationahip  by  x(n-u)  and  then 
taking  expected  values,  it  ie  found  that  the  Tule-Valker  equations 

P  4 

^  a^rx (n— k)  ■  5  ^i  h(i-n)  (2.3) 

k-0  i«0 

arise  where  a0  -  1.  The  entity  h(n)  herein  used  corresponds  to  the 
unit-im  t  .  /,e  (i.e.,  Kronecker  delta)  response  of  linear  operator  (2.2)  .  This 
unit-ietpul sc  response  nay  also  be  interpreted  as  being  the  inverse  Fourier 
transform  of  the  linear  operator's  transfer  funetion  Bq(eJM)/hp(eJM) .  In 
what  is  to  follow,  it  will  be  assumed  that  this  linear  operator  is  causal 
thereby  inlying  that  h(n)  •  0  for  n  negative.  Although  this  assumption  is 
not  essential  in  the  analysis  which  follows,  it  is  here  imposed  in 
recognition  of  the  fact  that  most  applications  are  inherently  involved  with 
causal  operations.  Adaption  to  the  case  where  noneausal  operations  are  more 
appropriate  is  straightforward  and  will  not  be  given. 

The  Tule-Valker  equations  (2.3)  take  on  a  particularly  simple  form  when 
the  linear  operator  (2.2)  which  they  describe  is  constrained  to  be  a  MA  or  an 
A£  linear  operator.  To  delineate  this  fact,  we  shall  now  examine  separately 
the  basic  characteristics  of  the  Tule-Valker  equations  when  the  underlying 
linear  model  is  taken  to  be  MA,  AS,  and  ASMA. 
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MA  Tine  Series 

Ike  time  series  (x(n)}  is  said  to  be  s  moving  average  random  process  if 
it  is  generated  recording  to  the  linear  nonreonrsive  relationship 


4 

x(n)  -  ^  bk  s(n-k)  (2.4) 

k-0 


where  (s(n)}  is  the  aforementioned  normalized  white  noise  excitation  process. 
According  to  the  general  Yule-Walker  equations  (2.3),  the  response's 
autocorrelation  sequence  is  therefore  specified  by 


4 

5  bk  bk-n 
«.  k-O 

0 


-4  i  »  i  4 
otherwise 


(2.5) 


where  use  of  the  facts  that  a^  ■  0,  and,  h(n)  ■  bn  for  0  i  a  i  q  have  been 
incorporated.  Thus,  the  autocorrelation  sequenoe  associated  with  a  moving 
average  process  is  seen  to  be  of  finite  length  (i.e.,  2q+l)  with  the  length 
identifying  the  order  of  the  NA(q)  process. 

We  shall  now  consider  the  problem  of  identifying  the  HA  parsmeters  bk 
whioh  correspond  to  a  given  2q+l  length  autocorrelation  sequenoe  rx(n)  for 
-q£n£4.  This  identification  will  be  made  by  examining  the  spectral  density 
function  associated  with  the  autocorrelation  sequenoe.  In  particular,  upon 
taking  the  z-transforn  (in  lieu  of  the  Fourier  transform)  of  the  given  2q+l 
length  autocorrelation  sequenoe,  we  have  upon  using  relationship  (2.5) 

4 

Sx(z)  *  ^  rx(n)z"n 

w—  4 

4  4 

-  1 

n^-q  k-0 

4  4 

■  1  bk  I_k  $  b«  *"  (2.6) 

k-0  bfO 
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Since  the  finite  power  series  Sz(z)  has  ooaplex  conjngete  coefficients  (i.e., 
rx(-n)  ■  rx(n)),  it  follows  that  the  xeroea  of  this  power  series  auat  occur 
in  reciprocal  pairs.  With  this  in  aind,  it  is  therefore  always  possible  to 
factor  the  power  spectral  density  funetion  as 


Sx<*> 


( l-Xfcx”1 )  ( I-"?** ) 


(2.7) 


where  a  is  a  real  scalar.  Upon  coaparing  expressiona  (2.6)  and  (2.7),  it  is 
apparent  that 

4  4 

^  t>k  *“k  ■  •  |J  ( 1-Xfcz-1 )  (2.8) 

k-0  k-1 


Thus,  the  required  paraaeter  identification  is  achieved  by  carrying  out 
the  right  side  aultiplications  in  expression  (2.8)  and  then  equating 
coefficients  of  equal  powers  of  s'*.  The  aost  critical  step  of  this 
identification  procedure  is  the  factorization  of  the  known  power  series  Sz(z) 
as  given  in  equation  (2.7) . 

One  point  of  caution  should  be  raised  in  following  this  approach.  It 
arises  due  to  the  faot  that  although  the  factorisation  of  Sz(z)  into  its  2q 
first  order  produot  tens  is  unique,  the  deeoaposition  (2.7)  is  certainly 
not.  This  is  a  direct  consequenee  of  the  appearanoe  of  the  roots  of  Sz(x)  in 
reciprocal  pairs.  Thus,  the  ten  (1-xix-1)  nay  be  replaced  by  (l-zi^x"1)  in 
expression  (2.8)  without  destroying  the  required  structure  (2.6) .  This 
roplaceaent,  however,  will  in  general  lead  to  a  different  set  of  b* 
paraaeters.  Since  there  are  typioally  q  different  first  order  reciprocal 
pairs  in  the  factorisation  (2.7) ,  it  then  follows  that  there  are  2*  different 
bn  paraaeter  sets  whioh  are  ooapa table  with  the  autocorrelation  identity 
(2.5).  The  one  nonally  ohosen  corresponds  to  the  so-called  alniana  delay 
selection  in  whioh  the  x^  roots  used  in  expression  (2.8)  are  seleoted  so  that 
they  all  have  nagnitudes  less  than  or  equal  to  one. 
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AR  Time  Series 

Hie  time  scries  (x(n)}  is  said  to  bs  an  autoregressive  (AR)  prooaaa  of 
order  p  if  it  ia  generated  aeeordiag  to  tbs  recursive  relatiouabip 

9 

x(n)  +  ^  Sfc  x(u-k)  -  b0e(n)  (2.9) 

k-1 


vbare  (s(a)}  is  tba  aforementioned  normalised  white  aoise  prooaaa.  The 
Tul  e-Walker  equations  (2.3)  iadioata  that  tba  AR(p)  autocorral atioa  elements 
ara  ralatad  by 


P 

rx(u)  +  5  *k  rx(a-k)  « 

k-1 


nH> 

a*L 


(2.10) 


whara  use  of  tba  facts  that  b(0)  -  bQ  aad  b(a)  -  0  for  a  <  0  bava  baaa  aada. 

Ia  ordar  to  effact  a  direct  procedure  for  idaatifyiag  tba  AR(p)  aodal's 
P+1  parasiatars  a  sj .  . ...  Sp,  b0  which  bast  represaat  tba  sat  of 

autooorr  el  atioa  lag  values  (2.1) ,  oaa  aay  evaluate  tba  first  p+1  of  tba 
goveraiag  Tula-Walker  equations.  This  evaluation  vbea  put  into  a  matrix 
format  takes  tba  form 


*x(0)  rx(-l)  .  .  .  rx(-p) 

1 

"ibol2" 

rx(l)  rx(0)  .  .  .  rx(-p+l) 

*1 

0 

•  •  •  •  •  • 

•  •  •  •  •  a 

•2 

• 

m 

0 

• 

•  •  a  a  a  • 

•  •  erne  a 

»x(p)  rx(p-l)  ...  rx(0) 

• 

• 

_  *9- 

• 

• 

0 

(2.11a) 
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or  aott  ooapaotly  u 

(2.11b) 

la  this  expression.  1  is  the  (p+l)x(p+l)  AK  autocorrelation  astrix  whose 
elsaents  are  given  by 

KCi,j>  -  rz(i-j)  1  i  i  i  P+1  (2.12) 

liJiP+1 

^  is  the  (p+l)xl  sntoregressive  psrssister  vector  with  first  component  equal 
to  one.  that  is 

JL  “  [l**i .  aj,  ....  Spl  (2.13) 

end  fi  is  the  (p+l)xl  standard  basis  vector  whose  elesients  ere  sll  aero 
except  for  its  first  which  is  one.  The  required  psraaeter  identification  is 
then  obtained  upon  solving  this  systea  of  p+1  linear  equations  in  the  p+1 
unknowns.  Conceptually,  this  solution  asy  be  effeeted  by  perforaing  the 
following  eoaputation 

A-lbo^r-le!  (2.14) 

in  whioh  the  noraalixing  coefficient  b0  is  selected  so  that  the  first 
coaponent  of  i  is  one  as  required  in  expression  (2.13) .  In  this  solution 
procedure,  we  are  tacitly  assuaing  the  invertibility  of  the  autocorrelation 
aatrix  R.  If  aatrix  R  is  singular*  however,  this  alaost  always  iaplies  that 
the  underlying  tiae  series  is  an  autoregressive  process  of  order  less  than  p. 
In  this  case,  it  will  be  necessary  to  decrease  the  order  until  R  beeoaes 
invertible. 

Upon  exaaination  of  expression  (2.11),  it  is  seen  that  the  resultant 
AR(p)  aodel  paraaeters  are  totally  dependent  on  the  first  p+l  given 
autocorrelation  lags  rz(0) ,  rz(l) ,  ....  rz(p).  Although  the  associated  aodel 
will  have  an  autocorrelation  behavior  whioh  perfectly  aatohes  these  first  p+1 
lags,  it  asy  provide  a  very  poor  representation  for  the  rawaining  given 
autocorrelation  lags  rz(p+l),  rz(p+2),  ....  rz(s)  (whioh  were  not  used  in  the 
psraaeter  identification).  In  order  to  provide  s  represention  for  these 
higher  lags  by  the  procedure  here  taken,  it  any  be  necessary  to  increase  the 
AR  aodel  order  to  s  (i.e.,  p“s) .  In  aany  applications,  however,  the 
underlying  goal  will  be  that  of  providing  an  AR  aodel  of  relatively  low  order 
(i.e.,  p<<s)  which  will  adequately  represrnt  the  entire  set  of 
autocorrelation  lags.  A  procedure  for  achieving  this  objective  will  be 
shortly  given.  Before  considering  this  aost  relevant  objeotive,  let  us  first 
outline  an  elegant  aethod  for  solving  the  systea  of  linear  equations  (2.11) . 
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LEV1NSCN-DUKBIN  ALGORITHM:  Although  the  solution  procedure  u  eabodiod 
in  expression  (2*14)  will  result  in  tho  desired  paraaeter  identification*  tho 
evaluation  of  will  entail  on  the  order  of  p3  ml ti plication  and  addition 
oaloulationa  (i.e.,  o(p3))  if  standard  procedures  such  as  Gaussian 

eliainatioa  are  used.  Fortunately*  it  is  possible  to  take  advantage  of  the 
fact  that  the  autocorrelation  aatrix  I  is  both  oos^lex  conjugate  spurn  trie 
(i.e.,  K(i» j)  -  and  Toeplitx  (i.e.*  R( ^ « j )  -  R( i+i,j+l))  so  as  to 

effect  a  computationally  efficient  solution  prooedure.  This  aethod  was 
developed  by  Levinson  and  is  ooaaonly  referred  to  as  the  Levinsoa-Durbin 
algoritha  [24]  *[431.  In  this  approach,  one  solves  the  linear  systea  of 
equations  (2.11)  as  the  AR  order  paraaeter  p  is  sequenced  through  the  values 
1,  2*  3*  ...,  p^  where  pa  designates  soae  as  yet  unknown  aaxiaua  AM  order. 
In  this  sequencing  scheae*  Levinson  showed  that  the  paraaeters  for  the  kth 
order  AR  aodel  solution  which  axe  designated  by 


,  (k) 


(k) 


*l'"'»  *2'"' »  ...»  b0(k)  (2.15) 

are  related  to  the  (k-l)th  order  AR  aodel  solution  as  outlined  in  Table 
2.1.  A  brief  description  of  this  systeaatie  algoritha  will  now  be 
given. 


Step  1 

a1<1)  -  -rx(i)/rx(0) 
lb0(1>|2-  [1-  |.|/*>  |l]  rx(0) 

(2.16a) 

(2.16b) 

Step  2 

For  k*2*3,  4*  ... 

k-1 

ak(k)  »-  rx(k)  +  J  a.(k-1)  rx(k-u)  /lb0<k_1)|2 

(2.17a) 

a-1 

at<k>  -  ai^1*  +  ak(k)  ai-i1)  1  i  i  i  k-1 

(2.17b) 

lb0<k>|2  -  [1  -  |ak<k>|2l|b0<M>|2 

(2.17c) 

Table  2.1.  Levinsoa-Durbin  Algoritha  for 

Recursively  Solving  Expression  (2.11) 


m 


If  one  were  to  solve  the  linear  system  of  equations  (2.11)  for  the  order 
ohoiee  p  ■  1,  it  would  be  found  that  the  required  first  order  AS  pursue tors 
(with  superscript  (1)  appended)  are  given  in  step  1  of  Table  2.1.  Upon 
setting  p  ■  2  in  expression  (2.11).  a  noderate  amount  of  algebraio 
manipulation  will  reveal  the  validity  of  the  solution  as  given  in  Step  2  of 
Table  2.1  with  k  ■  2  (with  superscript  (2)  appended).  Levinson  proved  that 
in  following  the  systematic  procedure  of  Table  2.1.  the  solutions  to  the 
Yule-Walker  equation  (2.11)  for  order  selections 

P  *  1,  2.  3.  ...  are  sequentially  obtained.  Moreover,  the  number  of 

araltiplication  (and  addition)  computations  required  in  generating  the  k**  AS 
order  parameters  from  the  k-l**  AS  order  parameters  (i.e..  Step  2)  is  seen  to 
be  k.  Thus,  the  computational  complexity  of  the  Levinson-Durbin  algorithm 
for  generating  a  ptlx  order  AS  model  (and  all  lower  order  models  as  a 
byproduct)  is  found  to  be  o(p2) .  This  is  a  considerable  savings  over  the 
computational  complexity  of  o(p3)  required  in  solving  expression  (2.11)  using 
standard  techniques. 

The  Levinson-Durbin  algorithm  provides  not  only  a  computationally 

efficient  method  for  generating  the  AS  parameters,  but,  it  also  yields  an 

effective  AS  model  order  determination  procedure.  Specifically,  let  it  be 

assumed  that  the  autocorrelation  lags  used  in  expression  (2.11)  correspond  to 

an  AS(p)  process.  If  the  Levinson-Durbin  algorithm  were  applied  to  this 

autocorrelation  lag  information,  by  the  very  nature  of  this  procedure,  the  AS 

process  parameters  would  be  perfectly  identified  at  the  p*k  iteration  (i.e., 

aj/P*  -  ak  k  -  1,  2,  ....  p  and  |b0*p*|2  -  lb02l).  Moreover,  if  this 

recursion  were  continued  beyond  p,  it  would  be  found  that  a^  «  a^  for  1  £ 

i  i  p.  at(k)  -  0  for  p+1  i  i  i  k,  and,  |b0(k)l  -  |b0|.  This  is  a  direct 

consequence  of  the  fact  that  ap+i/F*1)  must  be  zero  as  is  evident  from 

expression  (2.17a).  From  these  observations,  it  is  therefore  apparent  that 

2 

the  nonchanging  of  the  parameters  |bQ(k)|  provides  a  means  for  order 
determination. 

When  the  autocorrelation  lags  used  do  not  correspond  to  an  AS  process, 

(k)  2 

there  will  be  SS.  v»l®e  of  k  for  which  |b0  |  assumes  a  constant  value 

thereafter.  Since  the  specific  high  order  coefficients  ak^  will  always 
have  a  magnitude  which  never  exceeds  one  [5]  and  [14],  however,  it  is 
apparent  from  expression  (2.17o)  that  |b0*k*|2  £  |b0*k”1*|2  for  all  k  11, 
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Thus,  the  paraaeters  lbe^|^  fora  a  aonotonicslly  nonincr easing  sequence  and 
this  factor  can  be  used  in  aodel  order  deteraination.  In  particular,  the 
peraaeter  |b0‘*' I*  aay  be  identified  with  a  'prediction  error*  associated 
with  a  kth  order  linear  predictor.  Once  this  prediction  error  beeoaes 
satisfactorily  saall,  the  associated  AR(p)  aodel  will  fora  an  acceptably  good 
aouroxiaation  to  the  given  autocorrelation  sequenee  (e.g.,  see  ref.  [31]). 
The  aeaning  of  'satisfactorily  saall'  is  subjective  and  will  depend  on  the 
particular  application  being  considered  and  eapirically  obtained  experience. 

The  paraaeters  a^(b)  for  k  ”  1,  2,  3,  ...  are  also  referred  to  as 
'reflection  coefficients'  and  are  often  denoted  by  c^  ■  afc(fc).  These 
reflection  coefficients  have  the  property  that  for  the  trunoated  sequence 
rz(0),  rz(l),  ....  rx(p)  to  be  a  valid  segaent  of  an  autocorrelation 
sequenee,  it  is  necessary  and  sufficient  that  |o^|  £  1  for  k  -  1,  2,  ...»  p. 
Moreover,  the  transfer  function 


Ap(x)  ■  ^  sn  z-n 


(2.18) 


associated  with  the  solution  to  expression  (2.11)  will  have  all  of  its  roots 
on  or  inside  the  unit  cirole  if  and  only  if  the  |o^|  £  l  for  k  ■  1,2,  ....  p. 

It  is  noteworthy  that  the  systea  of  equations  (2.11)  also  arise  when 
solving  the  optiaua  one-step  predietor  problea,  or,  when  using  the  aaxiaua 
entropy  principle  [31] ^  In  the  one-step  predictor  problea,  it  is  desired  to 
select  the  p  predictor  paraaeters  a^  so  that  the  prediction 


*(n)  -  -  J  *k  x(n-k) 


(2.19) 


best  approxiaates  x(n)  in  the  sense  of  ainiaixing  the  aean  squared  prediction 
error  B(|x(n)-x(n) |2] .  One  aay  readily  show  that  the  optiaua  prediction 
paraaeters  are  found  by  solving  expression  (2.11)  in  which  lbol  Pl*7* 
role  of  the  aininua  aean  squared  prediction  error.  On  the  other  hand,  when 
applying  the  aaxiaua  entropy  principle,  it  is  tacitly  assuaed  that  the  tiae 
series  (x(a)}  is  a  zero  aean,  Gaussian  process.  The  objective  is  to  then 
find  a  power  speotral  density  fuaetion  Sz(sJ*)  whioh  will  aaxiaize  the 
entropy  aeasure 
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It 

J  in  [Sz(«j«)  ]  dm 


(2.20) 


subjeot  to  the  constraint  that  this  function  will  be  consistent  with  the 
given  set  of  p*l  autocorrelation  lags  rz(0) ,  rz(l).  ....  rx(p)  through  the 
Fourier  transform  pair  relationship  (1.3).  It  is  readily  shown  that  the 
maximising  power  speotral  density  function  is  an  AK  process  of  order  p  whose 
parameters  are  giwen  by  expression  (2.11) . 

AMULIiftg-  Mil 

The  time  series  (x(n)}  is  said  to  be  an  autoregressive-mowing  awerage 
(AJLKA)  process  of  order  (p,q)  if  it  is  generated  (or  can  be  modeled) 
according  to  the  recursive  relationship 

P  q 

x(n)  +  ^  x(n-k)  -  ^  bfc  (2.21) 

k-1  k*0 

in  which  the  excitation  sequenee  (s(n)}  is  the  aforementioned  normalized 
white  noise  process.  Our  task  is  to  then  determine  values  for  the  a^  and  bfc 
parameters  of  this  model  which  are  most  compatable  with  the  given 
autocorrelation  lags  (2.1).  The  mechanism  for  measuring  this  oompatability 
will  be  the  Yule-Walker  equations  (2.3)  which  characterize  the  above  ARJiA 
model.  Upon  examination  of  these  equations,  it  is  seen  that  the  ARMA 
parameters  appear  in  a  nonlinear  fashion  through  the  unit- impulse  response 
h(n).  If  the  best  least  squares  modeling  is  desired,  it  is  then  found  that 
the  generation  of  the  optimal  a^,  bfc  parameters  involves  the  least  mean 
square  solution  of  the  highly  nonlinear  Yule-Walker  equations.  This  will 
almost  always  necessitate  the  use  of  computationally  burdensome  nonlinear 
programming  algorithms  with  the  attendant  difficulty  of  initial  parameter 
value  selection,  and,  the  possibilities  of  convergence  to  a  local  extrema  or 
even  nonconvergenoe • 

A  considerable  easing  in  computational  requirements  may  be  achieved  if 
we  allow  ourselves  the  luxury  of  evaluating  the  a^  and  b*  parameters 
separately.  By  using  this  approaoh,  it  will  be  possible  to  provide  for  a 
linear  solution  procedure  for  the  a^  parameters.  Although  this  approaoh  will 
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be  suboptisul  in  nature,  it  often  provides  for  a  near  optimal  node ling.  The 
mechanism  for  this  separate  parameter  evaluation  is  obtained  upon  examining 
the  Yule-Walker  equations  (2.3)  which  oharacterizes  the  ARMA  model  (2.21). 
If  this  model  is  take  to  be  causal,  it  follows  that  the  Yule-Walker  equations 
assume  a  particularly  simple  form  for  indices  n  >  q,  that  is 

P 

^  »k  rx(n-k)  ■  0  for  n  2  q+1  (2.22) 

k*K) 

We  shall  refer  to  this  particular  subset  of  the  Yule-Walker  equations  as  the 
extended  Yule-Walker  equations.  The  obvious  attractiveness  of  these 
equations  lies  in  the  fact  that  they  are  linear  in  the  a^  parameters. 

To  determine  the  a^  autoregressive  parsmeters  which  are  most  compa table 
with  the  given  set  of  autocorrelation  lags  (2.21),  we  could  adopt  the 
approach  that  characterized  extended  Yule-Walker  equation  AR  and  ARMA 
modeling  methods  up  to  as  recently  as  three  years  ago  (e.g.,  see  refs 
[26] ,[28] ,[35] ,[38] ) .  This  would  entail  evaluating  the  first  p  extended 
Yule-Walker  equations  (i.e.,  q+1  i  n  i  q+p)  and  then  solving  the  resultant 
system  of  p  linear  equations  in  the  p  auto-regressive  parameters.  Although 
this  approach  is  computationally  attractive,  it  suffers  from  the  obvious 
drawback  that  only  a  subset  of  the  given  autocorrelation  lags  (2.1)  are  being 
used  in  fixing  the  a*  parameters  (i.e.,  rz(n)  for  q-p  <  n  ±  q+p).  To  achieve 
a  ARMA  model  which  better  represents  the  entire  set  of  autocorrelation  lags 
(2.1),  it  is  clearly  beneficial  to  use  more  than  the  minimal  number  (i.e.,  p) 
of  extended  Yule-Walker  equation  evaluations.  The  a^  parameters  which  yield 
a  least  squares  fit  to  this  overdetermined  set  of  linear  equations  is  then 
found  using  a  straightforward  procedure  to  be  shortly  given. 

This  overdetermined  extended  Yule-Walker  equation  approach  to  ARMA 
speotral  estimation  was  proposed  by  the  author  in  1979  [15] .  From  a 
historical  perspective,  it  is  to  be  noted  that  the  idea  of  using  an  extended 
set  of  model  evaluations  forms  a  fundamental  concept  in  system  parameter 
estimation  theory  (e.g.,  see  refs.  [45], [59]).  Moreover,  the  approach  here 
taken  can  be  interpreted  as  being  a  generalized  application  of  the  Proay 
procedure  in  which  the  autocorrelation  lags  play  the  role  of  the  data.  With 
these  thoughts  in  mind,  there  exists  a  rioh  source  of  evidenoe  justifying  the 
use  of  an  overdetermined  set  of 
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extended  Yule-Walker  equations  for  estimating  the  ARKA  model's  autoregressive 
parameters. 

In  this  overdetermined  modeling  approach,  the  extended  Yul e-Walker 
equations  (2.22)  are  evaluated  for  t  distinct  values  of  n  satisfying  n  2.  q+1. 
To  effect  the  desired  overdeterminacy.  the  integer  t  has  to  be  selected  to  at 
least  equal  p+1  although  larger  values  will  typically  yield  better  model 
representations.  To  illustrate  this  overdetermined  approach,  let. us  consider 
the  first  t  extended  Tule-Walker  equations  (2.22)  indexed  by  q+1  ^  a  ^  q+t. 
This  particular  Yule-Walker  equation  evaluation  gives  rise  to  the  following 
overdetermined  system  of  t  linear  equations  in  the  p  autoregressive  parameter 
unknowns* 


rx(q+l)  rx(q)  rx(q-p+l) 

1 

—  — 

0 

»1 

rx(q*-2)  rx(q+l)  ...  ^(^5+2) 

*2 

• 

■1 

0 

•  •  •  •  •  '• 

• 

*P 

• 

•  •  •  •  •  • 

rx(q+t)  rx(q+t-l)  ...  rx(q-p+t) 

• 

0 

or  more  oompaotly  as 

*1  a  -  s.  (2.23b) 

In  this  latter  expression.  •  denotes  the  txl  zero  vector.  Rj  is  the  tx(p+l) 
ARKA  autocorrelation  matrix  with  Toeplits  type  structure  having  elements 

Rl(i.j)  -  rx(q+l+i-j)  1  i  i  i  t  (2.24) 

liJiP+1 


*In  oertain  applications,  it  may  be  desirable  to  use  an  other  than 
contiguous  set  of  extended  Yule-Walker  equation  evaluations. 
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and  &  is  the  (p+1)  autoregressive  paraaeter  vector  whose  first  coaponent  is 
required  to  be  one 

A  m  [1*  *1 »  ®2  *  •••»  *pJ  *  (2.2S) 

Exaai nation  of  relationships  (2.23)  reveals  that  the  ABMA  aodel's 
autoregressive  paraaeters  are  obtained  upon  solving  this  systea  of  t 
overdo terained  (assuaing  t  >  p)  linear  equationa.  Due  to  the  overdo temined 
nature  of  these  equation,  the  fundaaental  question  as  to  whether  a  solution 
exists  naturally  arises.  The  following  theorem  provides  an  answer  to  this 
question  and  is  a  direct  result  of  the  Tule-Valker  equations  whieh  governs 
ARMA  processes. 

Theorem  2.1:  If  the  autocorrelation  lag  entires  used  in  aatrix  Rj  of 
expression  (2.23)  correspond  to  those  of  an  ABMA  (pi,qj)  process,  then 
the  rank  of  *i  is  pi  provided  that  p  2.  pi,  q  2. 

With  this  theorem  in  aind,  the  existence  of  a  solution  to  relationship  (2.23) 
will  be  dependent  on  the  rank  of  the  autocorrelation  aatrix  E^.  We  shall  now 
consider  separately  the  oases  in  which  *i  has  full  rank  and  less  than  full 
rank. 


Bank  [Ex]  £  p:  When  the  rank  of  aatrix  Bi  has  less  than  full  rank,  a 
nontrival  autoregressive  paraaeteric  vector  solution  g  will  be  assured.  An 
interesting  algebraic  characterization  of  this  solution  nay  be  obtained  upon 
preaulti plying  both  sides  of  relationship  (2.23)  by  the  coaplex  eonjuage 
transpose  of  Ej  as  denoted  by  eJ  to  yield 

*1*1  a  -  Si  (2,26) 

Upon  exaaination  of  this  expression,  it  is  clear  that  the  required 
autoregressive  paraaeter  veotor  aay  be  also  identified  with  a  properly 
noraalized  eigenvector  (i.e.,  its  first  coaponent  is  one)  associated  with  a 
zero  eigenvalue  of  the  (p+l)x(p+l)  aatrix  As  suoh,  we  aay  then  use 

standard  eigenvector-eigenvalue  routines  when  finding  the  required  ABMA  aodel 
autoregressive  paraaeters. 

Bank  [EiJ  -  p+1:  In  aany  oases  of  interest,  however,  it  will  be  found 
that  the  autocorrelation  aatrix  *i  will  have  full  rank.  This  will  occur 
whenever  the  autocorrelation  lag  entries  used  are  associated  with  either  a 
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nonrational  random  tine  series,  sn  XA  process,  or,  with  s  higher  order  AUtA. 
rstioasl  process.  Sinoe  hss  foil  rank,  there  then  will  sot  exist  s 
nontrivial  solution  to  relationship  (2.23).  Nonetheless,  ve  still  wish  to 
determine  sn  ARMA  model  vhioh  'best  fits'  these  overdetermined  extended 
Yule-Valker  equations.  Namely,  ve  seek  a  nonzero  autoregressive  parameter 
veotor  a  so  that  Ri  a  most  closely  equals  the  required  ideal  zero  vector  as 
specified  in  (2.23).  Although  a  variety  of  procedures  may  be  used  for 
accomplishing  this  selection,  the  following  two  approaches  typify  many 
spectral  estimation  algorithms. 

(i)  In  the  first  selection  procedure,  it  is  desired  to  find  an 
autoregressive  parameter  vector  lying  on  the  unit  hypersphere  whioh  will 
minimize  the  Euclidean  norm  of  R-|  a.  This  entails  solving  the  following 
contrained  optimization  problem 

min  A^RjRjA 
i'l  -  1 


Using  standard  Lagrange  multiplier  concepts,  it  is  readily  shown  that  the 
solution  to  this  optisrixation  problem  is  obtained  by  selecting  that 
orthonormal  eigenvector  of  the  positive  definite  Hermetian  matrix  RjRj 
associated  with  its  minimum  eigenvalue.  If  x^  eorreaponds  to  that 
orthonormal  eigenvector  (i.e.,  tjljgk  -  XfcjLk  **th  *k  £  **+1  aad  “  1) » 

the  required  autoregressive  parameter  vector  with  first  component  of  one  if 
obtained  by  the  normalization. 

A°  -  -  H  (2.27) 

*l(l) 

where  xi(l)  denotes  the  first  component  of  n.  This  autoregressive  parameter 
veotor  selection  procedure  characterizes  many  spectral  algorithms  which  are 
varienta  of  the  Pisarenko  method  [55]  and  is  generally  not  suitable  for  an 
efficient  computational  solution. 

(ii)  In  the  second  selection  procedure,  we  wish  to  minimize  the 
Euclidean  norm  of  R^a  over  all  (p+l)xl  vectors  g  with  first  components  equal 
to  one,  that  is 
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■ia 

a(l)-l 

Appealing  to  the  Lagrange  multiplier  approach,  again,  it  ia  found  that  the 
solution  to  this  constrained  optimisation  problem  is  given  bp  solving  the 
following  linear  system  of  equations 

RjR^o  .  OAl  (2.28) 

where  the  normalizing  constant  a  is  selected  so  that  the  firat  component  of 
a<>  is  one'.’ 

In  using  either  of  the  above  two  procedures,  we  are  aeeking  to  beat 
satisfy  theoretioal  relationahip  (2.23)  in  the  least  squares  sense  subject  to 
appropriate  oonstraintsl .  The  particular  application  at  hand 

dictates  which  autoregressive  parameter  vector  selection  procedure  provides 
the  best  performance.  It  has  been  the  author's  experience  that  the  selection 
(2.27)  has  often  provided  reasonable  modeling  (also  see  ref.  [12]).  In  terms 
of  computational  efficiency,  however,  the  linear  selection  (2.28)  enjoys  a 
clear  superiority  due  to  the  availability  of  efficient  adaptive  algorithma  as 
outlined  in  Section  X.  With  this  in  mind,  we  shall  mainly  foous  our 
attention  on  the  linear  selection  (2.28) . 

In  summary,  the  ARMA(p,q)  model  associated  with  a  given  set  of 
autoregressive  lags  entails  an  examination  of  the  matrix  R^.  if  thia  matrix 
is  not  of  full  rank,  the  required  exact  autoregressive  parameter  vector  will 
be  given  by  solving  expression  (2.26) .  On  the  other  hand,  when  the  matrix 
has  full  rank,  an  appropriate  autoregressive  parameter  vector  may  be  aohieved 
by  solving  either  expression  (2.27)  or  (2.28).  It  is  important  to  appreciate 
the  fact  that  these  ARXA  results  are  applicable  to  the  speoial  AR  process  in 
which  case  we  simply  enter  q*0  when  forming  the  ARMA  autocorrelation  matrix 

*1- 


*It  is  possible  to  generalise  the  constraints  to  be  a  quadratic  surface 
(giving  rise  to  a  generalised  eigenvector  solution)  or  a  hyperplane, 
respectively  [10]. 
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Moving  Average  Paraaeters 

In  order  to  coaplete  the  ARMA  Modeling,  it  ia  necessary  to  dateline  the 
aodel's  assoeiated  aoving  average  paraaeters.  There  are  a  variety  of 
prooednres  for  aohieving  this  objeotive.  Te  shall  present  tvo  each 
procedures  of  vhioh  the  first  is  the  one  aoat  often  found  in  the  literature 
while  the  second  possesses  a  desirable  efficient  coaputatioual 
iapleaentation. 

(i)  In  the  first  procedure,  one  conceptually  applies  the  tiae  series 
(x(n)}  to  the  p**  order  noureeursive  filter  with  transfer  function  Ap(s) 
whose  coefficients  correspond  to  the  autoregressive  paraaeters  obtained  upon 
solving  either  expression  (2.20.  (2.27)  or  (2.28).  This  filtering  produces 
the  so-called  residual  tiae  series  as  specified  by 

P 

s(n)  -  J  *a*(n-a)  (2.29) 

vO 


This  filtering  causes  the  residual  tiae  series  to  be  a  aoving  average  process 
of  order  q  with  power  spectral  density  function  |Bq(e^")l2  as  is  nade  evident 
froa  Figure  2.2.  This  of  course  presunes  that  (x(n))  corresponds  to  an  ARMA 
processor  of  order  (p.q)  or  less.  A  siaple  analysis  indicates  that  the 
length  2q+l  autocorrelation  sequence  of  this  residual  tiae  series  aay  be 
coaputed  according  to 


rg(n) 


P  P 

5  5  *k»a  r»(a+»-k) 

k-0  nr-0 

0 


-qiaii 


otherwise 


(2.30) 


Using  these  MA(q)  autocorrelation  lags,  it  follows  froa  expression  (2.5)  that 
the  unknown  b*  paraaeters  aust  be  suoh  that 


<1 

r,(n)  ■  ^  b*  bk-a  “4i,a£q.  (2.31) 


A  spectral  factorization  along  the  lines  aentioned  in  this  section's  MA  tiae 
series  subsection  will  then  yield  the  desired  b*  paraaeters. 


s(n) 

Bq(z) 

x(n) 

^p(s) 

s(n) 

Ap(z) 

Figure  2.2.  Generation  of  reeidnal  tine  series. 


(ii)  If  computational  requirements  are  of  vital  concern,  the  technique 
to  be  now  outlined  is  particularly  efficient  [15], [16].  It  utilizes  the 
Fourier  tranafon  of  the  causal  part  of  the  autocorrelation  sequence 

D(e>)  -  J  rx(n)e“j«*  (2.32) 

n-1 


the  underlying  power  spectral  density  function  nay  be  directly  detemined 
fron  this  Fourier  transform  according  to 

Sx(eJ*»)  -  rx(0)  +  2Re{D(ej“)}  (2.33) 

A  conparison  of  this  expression  with  relationship  (1.13)  reveals  that  the 
transforn  D(eJ“)  nust  be  of  the  font 

cje-J^+cje-]^  +  ...  +  c0e“iF“ 

D(eJ*)  -  - 

l+«ie-j“  +  ...  +  ape“jF“ 


C(ejn) 

Ap(eJ*) 


(2.34) 


where  we  are  tacitly  assuming  that  the  moving  average  order  is  not  larger 
than  the  autoregressive  order  (i.e.,  q  £  p). 

To  deternine  the  required  cn  coefficients  in  expression  (2.34),  we  will 
first  compute  the  first  s  inpulse  response  elements  of  the  filter  H(eJtt)  “ 
1/Ap(«jw).  This  will  entail  using  the  following  relationship 

P 

h(n)  -  “  J  akh(n-k)  1  i  n  i  s  (2.35) 

k-1 
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in  thicli  h(0)  *1  and  h(n)  ■  0  for  n  <  0  are  used  to  initiate  the  reeuraion, 
Ve  next  nee  the  tine  domain  equivalency  of  relationahip  (2.34)  to  conclude 
that 


’h(O)  0  0  ...  0 

«1 

rz(l) 

h(l)  h(0)  0  ...  0 

•  • 

°2 

rz(2) 

• 

•  • 

•  • 

h(p-l)  h(p— 2)  .  .  .  .  h(0) 

•  • 

• 

_CP_ 

” 

• 

• 

• 

•  • 

•  • 

h(s)  h(s-l)  ....  h( s-p+1 ) 

*-  _ 

rx(s) 

(2.36a) 


or 


Hi«l 


(2.36b) 


In  general,  the  overdeternined  system  of  equations  (2.36)  will  not  have 
a  solution  unless  the  autooorrelation  elements  rz(a)  are  associated  with  an 
ARMA  process  of  order  (p.p)  or  lover.  Assuming  this  not  to  be  the  case,  ve 
could  select  the  vector  £  so  as  to  provide  a  least  squares  solution  to 
expression  (2*36) .  This  vould  take  the  form  of  solving  the  consistent  system 
of  linear  equations 

£  -  [H«ll]-lll*£  (2 .37) 

In  order  to  aohieve  the  aforementioned  efficient  computational 
algorithm,  the  parameter  s  may  be  taken  to  be  p  vhioh  renders  the  following 
straightforward  method  for  evaluating  the  cB 

a-1 

•n  ■  J  akrx(n-k)  1  i  n  £  p  (2.38) 

k-0 

This  is  basically  the  approach  taken  in  references  [IS]  and  [16] .  In  using 
expression  (2.38)  for  evaluating  the  oa.  ve  are  trading  off  performance  for 
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computational  efficiency.  It  lias  been  the  author 'a  experience  that  the 
apeotral  estimates  achieved  upon  using  the  least  squares  fit  (2.37)  do  not 
typioally  provide  a  superior  performance  to  those  given  by  the  simpler 
relationship  (2.38).  In  any  oase.  once  the  en  parameters  have  been 
determined,  the  Fourier  transform  (2.34)  is  used  in  expression  (2.33)  to 
effeot  the  required  power  spectral  density  model.  Moreover,  if  it  is  desired 
to  evaluate  the  b^  parameters,  we  can  use  the  identity 

|Bq(«J*»)|2  -  Ap(oj*)  C(eJ*)+  Sp(oj-)C(eJ**)+rx(0)|Ap(ej‘*)U  (2.39) 

and  a  spectral  factorisation  to  achieve  this  objective. 

In  this  section,  we  have  outlined  convenient  procedures  for  generating 
IIA.  AS  and  ARMA  spectral  models  when  perfect  autocorrelation  lag  information 
is  available.  The  principle  steps  of  these  procedures  are  summarized  in 
Table  2.2.  Although  these  results  are  of  primarily  theoretical  interest,  we 
will  subsequently  adapt  them  to  evolve  effective  rational  spectral  estimation 
methods  for  the  more  practical  case  where  only  raw  time  series  observations 
are  used  in  the  modeling. 
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MA  Model 

4 

Sx(eJ**)  -  2  w(n)  * 

•x(n)  e"J“® 

(1.6) 

*— 4 

AR  Model 

(i) 

Form  the  (p+l)x(p+l)  AS  antooorrelation  matrix  S  using 
(2.12) 

expression 

(ii) 

Solus  Sjl  -  |bc|2 

(2.11) 

where  parameter  bQ  is  seleeted  so  that  the 
one. 

first  component  of  g  is 

(iii) 

bo 

2 

1+U!  •“>+...  +  ape^** 

ARMA  Model 

(i) 

Form  the  tx(p+l) 
(2.24) 

ARMA  autocorrelation  matrix  Rg  using 

expression 

(ii) 

(a)  If  Rank  (Ri*Rg)  <  p+1  then  solve 

*1**1  »  - 

• 

(2.23) 

(b)  If  Rank  (Rj^Rj)  -  p+1  then  either  solve 

*1**1  1  “ 

a  fii 

(2.28) 

where  a  is  seleeted  so  that  first  component  of  jg  is 
or 

use  the  minimum  eigenvalue-eigenvector  yielding 
(2.27) . 

one. 

selection 

P  P 

(iii) 

r,(n)  -  ^  ^ 

•kT.  rx(n+m-k)  0£n£q 

(2.32) 

k-O  u-0 

4 

5  *,(n)e"J,“ 

(iv) 

Sx(eJ“)  -  _ 

nF-q 

a^-J-  +  ...  +  ape"-!**  |* 

Table  2.2.  Rational  apeotral  model  techniques  employing 
exact  autocorrelation  lag  information. 
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III.  Sinusoids  in  White  Noise  Bxsmple 

He  procedure  ss  developed  in  the  preceding  section  is  sppliosble  to  the 
task  of  generating  rational  nodels  for  the  general  class  of  wide-sense 
stationary  tins  series.  In  order  to  demonstrate  the  relative  effectiveness 
of  HA,  AR,  and  ASHA  modeling,  the  olassioal  problem  of  the  detection  and 
frequency  identification  of  the  sinusoids  in  white  noise  case  will  now  be 
considered.  Although  this  does  represent  a  very  narrow  application  of 
rational  spectral  estimation  techniques*  it  provides  a  meaningful  basis  for 
understanding  the  relative  performance  capabilities  of  HA*  AR,  ARNA  models. 
In  particular*  the  time  series  being  now  exsmined  is  taken  to  be  oomposed  of 
the  sum  of  m  real  sinusoids  in  additive  noise  as  specified  by 

m 

x(n)  -  ^  Afc  sin  [2fff^n  +  e^]  +  w(n)  (3.1) 

k-1 

in  which  the  e^  are  independent*  uniformly  distributed  random  variables  on 
the  interval  [-u*n]  and  w(n)  is  a  xero  mean,  variance  c2  white  noise  prooess. 
It  is  recalled  that  the  problem  of  detecting  sinusoids  in  noise  originally 
gave  rise  to  spcotrsl  estimation  theory.  The  periodogram  method  was 
developed  for  this  very  purpose  by  Schuster  in  1898  [58]  . 

The  task  at  hand  is  to  generate  XA,  AR,  and,  ARRA  models  from  the 
autocorrelation  values  associated  with  this  time  series  using  the  procedures 
outlined  in  the  previous  section.  It  is  a  simple  matter  to  show  that  the 
autocorrelation  sequence  characterising  time  series  (3.1)  is  given  by 
m 

rx(n)  -  ^  0.5  A*  cos  +  0*61 n)  (3.2) 

k-1 

in  whioh  8(a)  denotes  the  unit- inpul e  (Krone cker  delta)  sequeaee.  The  power 

speotral  density  function  associated  with  this  process  is  composed  of  2m 

2 

dirae  delta  impulses  of  snplitude  0.5  A^  located  at  frequencies  +f*  riding  on 
top  of  a  constant  value  c2.  As  such,  this  discontinuous  power  speotral 
density  fumotioa  may  not  be  associated  with  a  finite  order  KA*  AR,  or  ARNA 
process. 
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Although  the  autocorrelation  sequence  (3.2)  is  not  eoapatable  with  a 
finite  order  ABMA  aodel,  it  is  readily  shown  that  this  sequence  will  satisfy 
the  following  homogeneous  relationships 

2a 

^  *k  rx(n-k)  ■  0  for  n  >  2a  (3.3) 

k-0 

where  a0  •  1.  The  ak  paraaeters  required  in  this  expression  are  obtained  by 
equating  coefficients  of  the  following  polynoaial  equivalency 

2a 

^2a(  *)  ■  J  an  z-n 
n-0 

a 

-  TT  [1“2  *“1  cos(2irfk)  +  z-2j  (3.4) 

k-1 

where  the  zeroes  of  this  polynoaial  (i.e..  e+j2xfk)  lz(  identified  with  the 
frequencies  of  the  tiae  series'  sinusoids  (e.g.,  see  refs.  [101 ,[32] ,155] ) . 

Upon  coapari son  of  relationships  (3.3)  and  (2.22),  it  aight  be 
incorrectly  infered  that  the  autocorrelation  sequence  (3.2)  would  be 
associated  with  an  ARMA  process  of  order  (2a,2a).  Upon  exaaination  of  the 
Tule-Valker  equations  for  indioies  0  £  n  £  2a,  however,  it  will  be  found  that 
an  exact  correspondence  does  not  result.  This  siaply  reflects  the  fact  that 
the  tiae  series  (3.1)  does  not  arise  froa  exciting  a  linear  ARMA  operator 
with  white  noise.  Nonetheless,  due  to  the  identical  foras  of  equations 
(2.22)  and  (3.3),  we  aay  still  use  the  ARMA  aodeling  autoregressive  parsaeter 
procedure  as  outlined  in  Section  II  to  identify  the  2a  paraaeters  ak.  These 
paraaeters  would  be  then  in  turn  inserted  into  relationship  (3.4)  to  identify 
the  frequency  paraaeters  fk  upon  factorization  of  the  polynoaial  AjaU). 
This  spectral  behavior  can  be  conveniently  displayed  in  a  plot  of 
il/A2a(eJ<*)  I  versus  a. 

Once  the  fk  frequency  paraaeters  have  been  determined,  the  associated  Ak 
aaplitude  paraaeters  aay  be  obtained  upon  evaluating  expression  (3.2)  over 
any  set  of  a  or  aore  indices  satisfying  n  2  1.  With  this  in  aind,  let  us 
evaluate  this  expression  for  the  contiguous  indices  1  i  n  £  v  where  the 
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integer  ▼  1  a.  This  is  found  to  yield  the  following  overdo  terained  (if  ▼>*) 
system  of  consistent  linear  equations  in  the  unknowns 


tx(  1) 


eoa(2»fi)  oos(2*f2> 


oos(2nfM) 


*x(2) 


cos(4rfi)  oos(4*f2) 


oos(4xfa) 


(3.5) 


L5*(v>  J.  jeos(2vnfi)  cos(2v«f2> 

or  equivalently  as 


cos(2»vfa) 


£  «  Cp  (3.4) 

2 

where  p  is  the  so-called  sxl  power  vector  with  elements  A^/j.  the  integer 
parameter  v  is  selected  to  be  larger  than  or  equal  to  a.  the  least  square 
approximate  solution  to  the  overdeteraining  equations  (3.5)  is  given  by 

p  *  (C'Cr1  C'  r  (3.7) 

where  C*  designates  the  transpose  of  natrix  C.  In  the  case  of  perfect 
autocorrelation  knowledge,  we  normally  set  v  “  a  thereby  giving  the  solution 
£  m  C"1^.  In  the  a  ore  practical  case  in  which  only  raw  time  series 
observations  are  given  for  the  estiamte,  however,  a  desirable  degree  of 
paraaeter  saoothing  is  achieved  by  selecting  v  >  a. 

Although  the  sinusoids  in  white  noise  tiae  series  (3.1)  is  not 
computable  with  an  AR  nodel,  AR  models  have  also  been  successfully  employed 
in  analyzing  such  tiae  series.  Depending  on  the  underlying  signal  to  noise 
ratios 

likis 

2«* 

the  desired  detection  and  frequency  estiaation  will  require  that  the  AR  order 
paraaeter  p  be  aade  sianificatlv  larger  than  2a.  Variants  of  the  Pisarenko 
method  (551,  and,  the  SVD  approaoh  of  Tufts  and  Kuaaresan  [42], [51]  typically 
produce  satisfactory  performance  on  the  sinusoids  in  white  noise  oase.  As  we 
will  illustrate  in  Section  VIII,  the  approach  taken  in  this  paper  will  also 


produce  exceptional  performance  when  an  SVD  adaption  of  the  ARJtA  modeling 
method  herein  presented  is  made. 


Alternate  Method 

It  is  possible  to  apply  the  concept  of  using  an  overdetermined  system  of 
model  evaluations  for  achieving  high  quality  alternative  estimates  for  the 
frequency  parameters  appearing  in  expression  (3.1).  This  will  make  use  of 
the  observation  that  homogeneous  relationship  (3.3)  holds  for  all  values  of  n 
provided  that  there  is  no  white  noise  present  (i.e.,  <j2  -  0)  .  Under  this 
restriction,  an  evaluation  of  expression  (3.2)  with  <y2  »  0  over  the  indices 
-t  +  2p  £  n  £  t  (in  which  p  ■  2m)  is  found  to  result  in  the  following 
symmetrical  relationship 


*x(”t+p)  rx(-t+p-l)  .  .  .  rx(-t) 

1 

0 

*x(-t+p+l)  rx(“t+p)  .  .  .  rx(-t+l) 

*1 

0 

.  •  • 

• 

• 

• 

m 

•  •  . 

• 

• 

•  •  . 

• 

_»P  _ 

e 

• 

*x(t)  rx(t-l)  .  .  .  rx(t-p) 

0 

or 

*s  A  -  «  (3.8b) 

in  which  t  is  selected  so  that  t  >  3m/2  thereby  ensuring  an  overdetermined 
system  of  homogeneous  relationships. 

If  the  autocorrelation  lag  entries  of  expression  (3.8)  correspond  to 
(3.2)  with  o2  «  0.  it  then  follows  that  the  overdetermined  system  of 
equations  (3.8)  will  have  a  unique  solution  for  the  a^  coefficients.  This 
solution  oan  then  be  incorporated  into  equation  (3.4)  to  obtain  estimates  for 
the  frequency  f*  parameters.  In  the  additive  noise  ease  +  0,  however, 
this  system  of  equations  will  generally  not  have  a  solution.  Sinoe  the  e* 
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ten  appears  in  only  p+1  out  of  the  (2t-p+l)x(p+l)  entries  of  aatrix  1, 
(i.e.,  the  rz(0)  entries),  it  can  be  argned  that  so  long  as  t>>p,  the  effect 
of  the  additive  noise  will  be  ainiaal.  Based  on  this  praaise,  it  is  natural 
to  then  seek  a  vector  &  such  that  this  inconsistent  systea  of  linear 
equations  is  best  satisfied  in  a  least  squares  sense.  The  required  least 
squares  solution  is  then  given  by  solving  the  systea  of  equations 

»,*  I  »,  i  -  a  11  (3.9) 

in  which  a  is  a  nonalizing  soalar  selected  to  ensure  that  the  first 
coaponent  of  £  is  one.  The  nonnegative  diagonal  aatrix  V  is  typically 
selected  to  be  equal  to  the  identity  aatrix.  As  we  will  see  in  Section  VIII, 
the  solutions  obtained  by  using  expression  (3.9)  often  provide  exceptional 
estiaates  so  long  as  t>>p.  A  paper  in  preparation  will  further  refine  this 
new  approach. 


Numerical  Ex ancle 

In  order  to  illustrate  the  effectiveness  of  the  three  rational  aodels  in 
resolving  sinusoids  eabedded  in  white  noise,  we  shall  now  consider  the 
specific  tine  series 

x(n)  ■  sin(0.4im)  +  sin(0.43im)  +  w(n)  (3.10) 

The  white  noise  series  (w(n)}  will  be  taken  to  have  a  variance  of  0.5  thereby 
creating  a  zero  dB  signal-to-noise  ratio  (SNR)  environment.  According  to 
relationship  (3.2),  the  autocorrelation  sequence  associated  with  this  tiae 
series  is  specified  by 

rz(n)  *  0.5  eos(0.4im)  +  0.5  cos(0.43im)  +  0.56(n)  (3.11) 

Ve  shall  now  use  these  autocorrelation  lags  along  with  the  concepts  developed 
in  Section  II  to  generate  appropriate  MA,  AR  and  ARMA  aodels.  A  brief 
discussion  of  the  resultant  aodeling  performances  in  this  idealistic 
situation  will  now  be  given. 

HA  Models:  When  using  the  classical  spectral  aodeling  expression 

<1 

Sz(eJ«*)  -  J  r*(a)  •~j“a  (3.12) 

nF-q 
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we  are  ia  effect  invoking  a  MA(  q)  aodel.  Plota  of  this  expression  with 
entries  (3.11)  for  aodel  order  selections  of  q  ■  32  and  q  *  64  are  shown  in 
Figure  3.1  over  the  range  of  noraalized  frequencies  0  i  f  £0.5.  Froa  these 
results,  it  is  apparent  that  a  resolution  of  the  two  equal  aaplitude 
sinusoids  was  not  achieved  for  a  thirty  second  order  MA  aodel*  but*  was 
achieved  for  a  sixty  fourth  order  MA  aodel.  Thus,  an  artificially  high  order 
MA  aodel  was  required  in  order  to  resolve  the  two  sinusoids  when  exaot 
autocorrelation  lags  were  used.  This  exaaple  nicely  deaonstrates  the 
distortions  which  can  result  when  invoking  a  MA  aodel  if  the  underlying 
assuaption  that  rz(n)  ”  0  for  n  >  q  thereby  iaplied  is  not  satisfied  (or 
approxiaately  satisfied).  Clearly*  the  nondaaped  nature  of  the 

autocorrelation  sequenoe  (3.2)  behavior  indicates  that  the  MA  aodeling  of  a 
tiae  series  ooaposed  of  sinusoids  in  white  noise  can  be  inappropriate  unless 
a  sufficiently  large  selection  of  the  MA  aodel  order  q  is  Bade. 

AR  Models:  Ve  next  used  the  saae  autocorrelation  lag  inforaation  (3.11) 
to  generate  AR  aodels  of  order  p  ■  20  and  p  «  24  using  expression  (2.11) . 
The  resultant  spectral  estiaates  1/ lAp(«j«»)  |2  are  shown  in  Figure  3.2a  and  b 
for  these  two  aodel  order  choices.  It  is  apparent  that  the  twentieth  order 
aodel  was  unable  to  resolve  the  two  sinusoids  while  the  twenty- fourth  was 
just  able  to  achieve  the  resolution.  Sinoe  the  speoific  autocorrelation  lags 

rx(n)  for  0  £  n  £  p  were  required  for  generating  an  AR(p)  aodel*  it  is 
apparent  that  fewer  autocorrelation  lags  were  needed  to  resolve  the  two 
sinusoids  when  using  an  AR(24)  aodel  in  coaparison  to  the  MA  aodel.  This 
sinqply  gives  credence  to  the  previously  Bade  suggestion  that  AR  aodels 
provide  a  aore  effective  instruaent  for  representing  peak  like  spectra  than 
are  MA  aodels. 

In  order  to  illustrate  the  effeot  of  using  aore  than  the  ainiaal  nuaber 
of  extended  Yule-Walker  equations  (i.e.*t  >  p)  when  generating  an  AR  aodel* 
we  next  used  the  ARMA  aodeling  equations  (2.23)  with  psraaeters  p»10,  q-0, 
and  t»100.  The  AR(10)  aodel  which  results  upon  solving  equations  (2.23)  for 
this  ohoiee  of  order  psraaeters  has  a  speotral  behavior  as  depicted  in  Figure 
3.2o.  This  AR(10)  speotral  estiaate  is  seen  to  be  uignifioantly  better  than 
that  achieved  by  the  higher  order  AR(24)  estiaate.  Clearly*  the  process  of 
using  100  (i.e.*  t-100)  extended  Yule-Walker  equation  evaluations  instead  of 
the  ainiaal  nuaber  10  has  produced  this  significant  iaproveaent.  This 
iaprovsaent  is  due  to  the  faot  that  only  the  first  four  of  the  one  hundred 


extended  Yule-Walker  equation  evaluations  are  in  error  due  to  the  inposition 
of  an  inproper  AR  nodel  (see  equation  (3.3)).  By  increasing  t  beyond  p,  the 
effect  has  been  to  dilute  the  negative  inpact  of  the  erroneous  first  four 
Yule-Walker  equations  on  the  nodel  paraneters  ( i.e. ,  four  inproper  equations 
and  96  appropriate  equations)*.  The  reader  is  urged  to  fully  understand  the 
duplications  of  this  result  in  a  nore  broadly  based  context. 

ARMA  Model:  Ve  next  used  the  given  autocorrelation  lag  infomation 

(3.11)  to  generate  an  ARMA  nodel  of  order  p  *  4  by  appealing  to  expression 
(2.23) .  Ve  here  seleot  the  variable  t  to  be  equal  to  its  nininal  value  of 
four,  and.  in  accordance  with  this  section's  discussion  take  q  ■  4.  The 
resultant  ARMA  based  spectral  nodel  l/lA4(ejM)P  without  the  MA  eonponent  is 
plotted  in  Figure  3.3.  The  two  sinusoids  are  nicely  resolved  and  when  the 
fourth  order  polynouial  AgfeJ*)  was  factored,  it  was  found  to  have  its  four 
roots  on  the  unit  circle  at  eii^nffc  for  k  *  1,2  in  which  ■  0.2  and  f2  “ 
0.215.  This  should  not  be  surprising  since  it  was  previously  shown  in  this 
section  that  an  ARMA  type  nodel  is  perfectly  conpatible  with  a  sinusoids  in 
white  noise  tine  series  (MA  and  AR  nodels  are  not  conpatible).  It  is 
noteworthy  that  only  the  autocorrelation  lags  rz(n)  for  1  £  n  £  8  were 
required  in  generating  the  spectral  nodel  depicted  in  Figure  3.3. 

Alternative  Method:  As  a  final  procedure,  we  used  the  alternative 
wethod  as  represented  by  relationship  (3.9)  in  which  the  paraneters  were 
taken  to  be  p>"10  and  t»50.  Using  these  paraneters  along  with  the  theoretical 
autocorrelation  lag  entries  (3.11)  a  plot  of  the  resultant  estinate 
1/  |Aio(eJw)  |2  is  shown  in  Figure  3.4.  The  two  sinusoids  are  resolved  with 
well  defined  peaks,  and,  the  spectral  estinates  are  superior  to  those 
achieved  by  the  MA  and  AR  nodel  results  but  inferior  to  the  ARMA  nodel. 


37 


Fig.  3.3  Autoregressive-moving  average 
(ARMA)  spectral  models  using 
expression  (2.23)  with  exact 
autocorrelation  lags  and  p=t=4. 


I 


Fig.  3.4  Alternative  method  with 
p  *  10  and  t  =  50. 
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IV.  MA  Modeling  -  Tine  Series  Observations 

Fron  a  practical  viewpoint,  the  situation  in  which  exact  autocorrelation 
lag  values  are  given  for  effecting  a  spectral  estimate  alaost  never  arises. 
More  typically,  the  required  spectral  estiaate  is  to  be  generated  fron  a 
finite  set  of  contignons  tiae  series  observations  as  represented  by 

x(l),  x(2) ,...,  x(N)  (4.1) 

In  this  section,  we  will  be  concerned  with  achieving  MA  spectral  estiaates 
fron  this  observation  set.  The  aethods  to  be  presented  for  this  purpose  are 
largely  influenced  by  the  theoretical  developments  found  in  Section  II. 

There  exist  two  primary  MA  spectral  estiaation  procedures  that  have 
found  favor  aaong  users.  They  are  indireot  aethods  based  on  autooor-relation 
estiaates  such  as  proposed  by  Blaokaan  and  Tukey  [8] ,  and,  direct  aethods 
based  on  the  Fonrier  transfora  of  the  tiae  series  observations  and  widely 
known  as  the  periodogrsa  (or  the  method  of  averaged  periodograas  due  to  Welch 
[64]).  As  we  will  shortly  see,  the  periodogrsa  is  a  special  case  of  the 
Blackaan-Tnkey  approach. 


B1 ackaan-Tuke v  Approach 

In  the  Blsekasnr Tukey  aethod,  one  first  obtains  autocorrelation 
estiaates  rz(a)  froa  the  given  observation  set  (4.1).  These  autocorrelation 
estiaates  are  then  inserted  into  expression  (1.2)  to  effect  the  required 
spectral  estiaate.  For  a  variety  of  reasons,  it  is  often  beneficial  to 
introduce  a  windowing  sequence  w(n)  to  achieve  the  windowed  MA  spectral 
estiaate  of  order  q 

q 

S(ej*»)  -  ^  w(n)  rx(n)e-j«l  (4.2) 

Considerations  to  be  aade  in  selecting  the  window  sequence  are  well 
doeuaented  and  the  reader  is  referred  to  references  [32] , [50] ,[57] .  Two  of 
the  aore  popular  selections  are  the  rectangular  window  (i.e.,  w(n)  “  1)  and 
the  Bartlett  triangle  window  (i.e.,  w(n)  -  (l-|n|)/(q+l)) . 

The  standard  unbiased  and  biased  autocorrelation  estiaates  are  aaong  the 
aost  popular  candidates  to  be  used  in  the  spectral  estiaute  (4.2)  (e.g.,  see 
ref.  [33]  for  a  detailed  developaent) .  The  unbiased  estiaate  achieves  the 
required  autocorrelation  lag  estimate  according  to 
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N 

rx(n)  ■  — i ^  x(k+n)x(k)  -q£  n  £  4  (4.3) 

N-lnl 

vhera  the  convention  of  setting  to  zero  any  ten  x(n)  in  the  summand  for 
which  n  s  [1,N]  is  adopted.  It  is  a  siaple  natter  to  show  that  E(rz(n)}  ■ 
rz(n)  thereby  establishing  the  unbiased  nature  of  eatinate  (4.3).  Moreover, 
this  unbiased  estimate  is  also  consistent  so  long  as  the  order  paraaeter  q  is 
finite. 

Notwithstanding  the  obviously  attractive  statistical  properties 
possessed  by  the  unbiased  estimate  (4.3),  a  number  of  prominent  statisticians 
have  proposed  using  the  standard  biased  estimate  (e.g.,  see  refs. 
[33] ,[52] ,[33] . 

N 

rz(n)  -  ^  ^  x(k+n)x(k)  -q  £  n  ±  q  (4.4) 

k-1 

Ve  again  adhere  to  the  convention  of  setting  to  zero  any  ten  x(n)  in  the 
summand  for  which  n  s  [1,N].  The  justification  for  using  the  biased  estimate 
is  that  it  is  more  stable  statistically.  It  must  be  noted,  however,  that  the 
relative  advantages  of  unbiased  vs.  biased  estimators  remains  an  unsettled 
issue.  With  this  in  mind,  the  user  is  cautioned  to  base  his  ultimate 
selection  on  the  particular  application  being  considered.  This  will 
uadoubtably  entail  a  great  deal  of  empirically  based  experimentation  on  the 
users  part. 


Periodotram 

In  the  periodogram  method,  the  required  spectral  estimate  is  given  by 
the  expression 


Ss(e4«)  -  g  ll«(eJ‘»)|l 


(4.5) 


where  %(eJ»)  is  the  Fourier  transf on  of  the  time  series  observations,  that 
is 
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(4.6) 


N-l 

%(eJ**)  *  ^  x(n+l)e“j«a 

n-0 

We  here  nee  the  subscript  N  on  I^(ejw)  to  explicitly  denote  its  dependency  on 
the  obserrstion  length  parameter.  It  is  readily  shown  that  the  periodogram 
is  identical  to  the  Blsckmaar-Tukey  approach  when  the  biased  estimates  (4.4) 
are  need  in  expression  (4.2)  with  q  “  N-l  and  w(n)  -  1. 

The  primary  advantage  in  using  the  Periodogram  approach  is  computational 
in  nature.  Specifically,  the  values  of  the  periodogram  at  the  N  discrete  set 
of  uniformly  spaced  radian  frequencies  ■  2nk/N  for 
Oiki  N-l  is  seen  to  entail  evaluation  of  the  entities 

N-l 

2nk  T  2nkn 

%(eJ  IT)  "  l  x(n+l)e"J  “5“  »  0  i  k  i  N-l  (4.7) 

n**0 

These  evaluations  are  readily  carried  out  by  use  of  the  N  point  fast  Fourier 
transform  (FFT)  algorithm  (e.g..  see  refs.  [50],  [57]).  With  the  FFT 
algorithm,  the  N  quantities  (4.7)  may  be  computed  in  which  the  required 
number  of  complex  additions  and  multiplications  is  on  the  order  of  N  log^N. 
The  computational  savings  accrued  in  using  the  FFT  algorithm  for  spectral 
estimates  is  considerable  when  it  is  realised  that  a  direct  evaluation  of 
expression  (4.7)  is  seen  to  entail  N*  complex  additions  and  multiplications. 
Due  to  the  computational  savings  aoerued  in  using  the  FFT  implementation  of 
the  periodogram,  speotral  estimates  of  long  data  sequences  beoame  feasible 
with  the  FFT' s  development. 

Although  the  FFT  algorithm  offers  a  computationally  effijient  means  for 
numerically  evaluating  the  periodogram  (4.5) ,  it  possesses  a  potentially 
serious  drawback.  Specifically,  as  just  suggested,  this  FFT  implementation 
provides  a  sampled  version  of  the  periodogram  in  which  the  frequency  samples 
are  separated  by  2n/N  radians.  For  many  applications  of  interest,  this 
sampling  may  be  too  coarse  in  that  the  detailed  continuous  frequency  behavior 
of  the  periodogram  (4.5)  may  be  somewhat  obscured  through  the  sampling 
process.  An  example  of  this  will  be  given  in  Section  Fill.  In  order  to 
alleviate  this  potential  difficulty,  we  may  apply  the  concept  of  xero 
oaddlna.  This  simply  entails  th«  appending  of  L  zeroes  to  the  given  set  of 
time  series  observations,  that  is 
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x(l),  x(2) »  ....  x(N) ,  0.  0.  ....  0  (4.8) 

L  zeroes 

where  L  is  s  yet  unspecified  positive  integer.  If  we  were  to  take  the 
Fourier  transfora  of  this  padded  tiae  series,  we  would  obtain  the  saae 
transfora  (4.0  and  the  saae  periodograa  function  (4.5).  On  the  other  hand, 
if  ve  were  to  take  a  N+L  point  FFT  of  this  padded  tiae  series,  the  following 
wore  finely  spaced  ssaples  of  the  Fourier  transfora  would  be  generated 

N-l 

.  2nk  \  ,  ...  _ j  2nkn 

<*  *  Jo  i^T  0^<N+I-  <4'»> 

If  these  ssapled  values  were  then  substituted  into  expression  (4.5).  we  would 
obtain  saapled  values  of  the  periodograa  at  the  more  finely  spaced 
frequencies  *  2n/(N+L)  for  0£k<N+L.  The  effeot  of  the  L  zero  padding  is 
then  seen  to  result  in  a  reduction  of  the  frequency  saapling  interval  froa 
2jt/N  to  2n/(!|H-L)  .  By  selecting  L  suitably  large,  ve  can  reduce  this  saapling 
interval  to  any  degree  desirable. 

One  should  not  gain  the  aistaken  iapression  that  padding  will  enable  us 
to  achieve  any  degree  of  frequency  resolution  desired.  The  fuadaaental 
unsaapled  periodograa  (4.5)  has  an  inherent  frequency  resolution  capability 
of  Am  "  2it/N  (or  equivalently  Af  -  1/N) .  When  using  a  N  point  FFT 
iapleaentation  of  the  periodograa.  however,  it  is  entirely  possible  that 
speotral  peaks  nay  lie  between  the  saapled  frequencies  m^  »  2nk/N.  In  such 
cases,  the  peaks  effeot  on  the  saapled  periodograa  aay  be  seriously  diluted 
even  though  it  would  be  clearly  evident  in  the  unsaapled  periodograa.  Upon 
padding  with  L  zeroes,  we  oan  raaove  the  aabiguity  caused  by  this  saapling 
process  and  still  retain  the  ooaputational  efficiency  of  an  FFT 
inpl  eaenta  ti  on. 
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V.  AS  Modeling  -  Tine  Series  Observations 

The  task  of  generating  AS  speotral  nodal  a  fro*  a  set  of  tine  series 
observations  has  been  of  primary  oonoern  to  many  investigators  over  the  last 
fev  years.  Undo nbt ably,  the  most  widely  used  AS  modeling  procedure  is  the 
Burg  algorithm  as  first  proposed  in  1967  [13].  This  algorithm  not  only 
provided  a  speotral  estimation  capability  that  was  theretofore  lacking,  it 
also  inspired  an  intense  search  for  improved  rational  spectral  estimation 
procedures.  Much  of  contemporary  speotral  estimation  theory  has  been 
direetly  influenced  by  the  philosophy  contained  within  the  Burg  approach.  As 
a  matter  of  fact,  many  of  the  more  recent  rational  estimation  procedures  were 
developed  so  as  to  overcome  some  of  the  defioienoies  observed  in  the  Burg 
slgorithm  as  typified  by  line  splitting  and  biased  frequency  estimates. 
Nonetheless,  the  Burg  algorithm  still  occupies  the  pre-eminent  position  among 
contemporary  AK  modeling  methods.  Since  its  operational  behavior  is  so  well 
documented,  we  refer  the  interested  reader  to  the  relevant  literature  for  its 
detailed  development  (e.g.»  see  refs.  [23]. [31]). 

In  this  section  and  section  IX.  we  will  demonstrate  that  many  of  the 
popularly  used  AK  methods  (which  inoludes  the  Burg  algorithm)  may  be 

interpreted  as  providing  statistical  estimates  of  the  fundamental  Yule-Walker 
equations  (2.11)  that  govern  AK  processes.  These  estimates  are  to  be 
obtained  from  the  set  of  contiguous  time  series  observations 

x(l) ,  x(2) .  ....  x(N)  (5.1) 

whioh  are  made  available  through  some  measureaient  mechanism.  More 

speoif ioally.  it  is  well  known  that  various  contemporary  methods  either 
explicitly  or  implicitly  use  these  observations  to  generate  estimates  of  the 
(p+l)x(p+l)  autocorrelation  matrix  R  which  appears  in  the  fundamental 

relationship  (2.11).  Clearly,  the  elements  of  the  matrix  estimate  k  must  be 
such  that 

R(i,j)  is  an  estimate  of  rx(i-j)  for  1  £  i.j  £  p+1  (5.2) 

Once  these  estimates  have  been  oomputed  from  the  given  time  series 

observations,  the  resultant  autoregressive  parameter  vector  estimate  is.  in 
accordance  with  expression  (2.11).  obtained  by  solving  the  linear  system  of 
equations. 

*a-M>oI2*i  (5.3) 
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in  which  the  normalizing  parameter  b0  is  selected  so  that  the  first  component 
of  &  is  one.  The  steps  of  this  general  AR  modeling  approach  are  anmmarized 
in  Table  5.1-. 


Step  1:  Compute  Estimates  of  R(i,j)  ”  rx(i-j)  for  1  .£i,j,£p+l  to  form 
the  (p+l)x(p+l)  autocorrelation  matrix  estimate  R.. 


Step  2:  Solve  the  linear  system  of  equations  Ra  “  |b0|2  oji  in  which  the 
normalising  coefficient  b0  is  selected  so  that  the  first 
component  of  i  is  one. 

Step  3:  The  required  AR(p)  spectral  estimate  is  then  specified  by 

S^eJ**)  -  I  - - !!• - 

I  1  +  ai  e"J“  +  ...  +  «p  e”JP** 


Table  5.1  Basic  steps  in  obtaining  an  AR(p)  spectral  estimate. 


The  quality  of  the  AR  modeling  approach  as  embodied  in  expression  (5.3) 
is  critically  dependent  on  the  choice  of  the  autocorrelation  lag  estimation 
procedure  used.  For  many  applications,  the  standard  unbiased  autocorrelation 
estimates  as  given  by 

N 

R(i,j)  -  - 1 -  5  *<k+i-j)x(k)  1  i  i  i  P+1  (5.4) 

N-li-jl  ktx  lijiP+1 

typically  provides  the  best  selection  in  terms  of  spectral  estimation 
performance.  It  is  seen  that  the  autocorrelation  matrix  formed  from  this  set 
of  estimates  will  be  Toeplitz  and  symmetric;  properties  shared  by  the  aetual 
autocorrelation  matrix  being  approximated.  Moreover,  this  estimate  is 

A 

consistent  in  the  sense  that  as  N  approaches  infinity,  we  have  R  — ►  R  under 
the  second  order  ergodic  assumption  on  the  time  series.  In  view  of  all  of 
these  favorable  qualities,  it  is  not  surprising  that  the  standard  unbiased 
estimator  (5.4)  generally  provides  excellent  AR  modeling  performance.  In 
Section  IX,  some  of  the  more  popularly  used  adaptive  methods  of  AR  speotrsl 
estimation  will  also  be  studied. 


43 


VI.  ARMA  Modeling:  Tine  Series  Observations 

The  methods  for  generating  ARMA  nodels  based  upon  times  series 
observations  fall  into  basically  two  categories:  the  a^  and  bfc  parameters 
are  either  evalnated  (i)  simultaneously  or  (ii)  separately.  In  the  first 
oategory,  maximum  likelihood  based  techniques  form  one  of  the  most  widely 
used  of  snoh  methods.  These  include  exact  maximum  likelihood  approaches 
(e.g.,  refs  [6]  and  [48]),  and.  least  square  methods  vhioh  approximate  the 
exact  likelihood  function  (e.g.,  refs  [3]*  [9].  [29]).  Although  offering  the 
promise  of  optimum  modeling,  these  maximum  likelihood  methods  entail  the 
application  of  nonlinear  programming  solution  procedures.  As  such,  these 
solution  procedures  are  computationally  inefficient,  and.  they  suffer  the 
obvious  drawbacks  characteristic  of  nonlinear  progrmaming  methods.  Other 
nonmaximum  likelihood  methods  which  fall  into  category  (i)  have  been  proposed 
(e.g..  see  refs.  [30] ,[40] ,[60] ) .  These  methods  also  entail  the  utilisation 
of  nonlinear  programming  solution  procedure*. 

In  recognition  of  the  obvious  shortcomings  of  nonlinear  programming 
based  techniques,  a  number  of  methods  have  been  proposed  which  employ  a 
separate  evaluation  of  the  AS  and  MA  parameters.  By  using  this  approach,  it 
is  generslly  possible  to  obtain  satisfactory  modeling  while  not  incurring  the 
drawbacks  of  a  nonlinear  programming  solution  procedure.  These  techniques 
typically  entailed  using  the  first  p  extended  Yule-Valker  equations  to  obtain 
estiamtes,  in  a  linear  fashion,  for  the  AS  parameters  (e.g.,  see  refs 
[26] ,[28] ,[35] ,[38] ) .  Unfortunately,  the  utilization  of  the  minimal  number 
of  extended  Yule-Valker  equations  (i.e.,  p)  gave  rise  to  an  undesirable 
parameter  hypersensitivity.  In  recognition  of  this  fact,  a  procedure  for 
using  a  overdetermined  set  of  Yule-Valker  equation  evaluations  to  deorease 
this  hypersensitivity  was  proposed  [13] .  This  approach  has  since  been 
adopted  by  other  researohers  in  spectral  estimation  applications  with  sueeess 
(e.g.,  see  refs.  [71 »C12] ,[36] ,[S1] ) .  Vith  this  in  mind,  we  shall  now  give  a 
detailed  development  of  the  overdetermined  approach  to  estimating  the  AR 
parameters  of  an  ARMA  model. 


AR  Parameter  Estimation 


Although  the  procedure  presented  in  Section  II  for  generating  ARMA 
models  is  attractive,  one  is  rarely  provided  vith  exaot  autocorrelation 


I 


information.  Hie  more  common  situation  is  one  in  which  the  only  available 
information  takes  the  form  of  a  finite  set  of  time  series  observations 

x(l),  x(2) »  ....  x(N)  (6.1) 

The  task  at  hand  is  to  then  nse  these  time  series  observations  to  estimate 
the  parameters  of  a  postulated  ARMA  model,  la  this  parameter  estimation,  we 
shall  seek  to  incorporate  the  philosophy  as  embodied  in  the  extended 
Yule-Valker  ARMA  model  equations  (2.23)  for  estimating  the  model's  a^ 
parameters. 

This  will  effectively  entail  using  the  given  time  series  observations  to 
generate  an  estiawte  of  the  tx(p+l)  autocorrelation  matrix  R^  which  appears 
in  expression  (2.24) .  Namely,  using  any  of  a  number  of  available  procedures, 
we  first  compute  the  following  autocorrelation  lag  estimates 

A 

*l(i»j)  ■  an  estimate  of  rx(q+l+i-j)  1  £  i  £  t  (6.2) 

liJiP+1 

Two  particularly  attractive  procedures  for  effecting  these  autocorrelation 
estimtaes  will  be  detailed  at  the  end  of  this  section  and  in  Section  X. 
Independent  of  what  procedure  is  eventually  used,  the  net  result  of  this 
first  step  will  be  the  generation  of  a  tx(p+l)  autocorrelation  matrix 

A 

estimate  Rj.  Due  to  errors  inherent  in  the  autocorrelation  estimation 
process,  however,  this  matrix  estimate  will  generally  have  full  rank  (i.e., 
min  (p+l.t))  instead  of  the  theoretical  rank  p  which  is  possessed  by  the 
matrix  Rj  being  estimated.  This  being  the  case,  it  is  therefore  not 
generally  possible  to  find  an  autoregressive  parameter  vector  with  first 
component  equal  to  one  which  will  satisfy  the  theoretical  relationship  R|a  “ 
o  as  given  in  equation  (2.23).  As  such,  the  txl  extended  Yule-Valker 
equation  error  vector  as  specified  by 


e  -  R !2  (6.3) 

will  be  generated. 


A  little  thought  will  convince  oneself  that  the  elements  of  this  error 
veotor  will  be  composed  of  a  sum  of  many  random  variable  products  (i.e. » 
x(k+m)f(m))  used  in  formulating  the  autocorrelation  lag  estimatee. 
Consequently,  an  assumption  that  the  error  vector  elements  tend  to  be 
Gaussianly  distributed  is  u  reasonable  one.  The  joint  density  function  of 
the  extended  Yule-Valker  equation  error  veotor  may  be  therefore  approximated 
by 
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P<£> 


l-ii/a 

|T|  e-0.5(**f*) 


(2it) 


t/2 


($.4) 


in  vbieb  f"l  ■  E(ae»]  d«tl|ott«i  tkf  error  toririus*  utrii  vUob  is 
generally  unknown  and  where  the  expected  value  of  |  is  taken  to  be  zero. 

With  the  availability  of  tfca  error  joint  density  fane t ion  (4.4)*  it  is 
now  possible  to  apply  the  maximum- likelihood  concept  for  estimating  the 
autoregressive  parameters.  Namely,  making  use  of  relationship  (6J)  and  the 
joint  density  function  (4.4),  it  is  possible  to  generate  a  joint  density 
function  for  the  autorgressive  parameter  veetor  a  which  will  be  of  form 

p(h)  ■  ye~0 .5(g*N*fBi) 

Ve  now  seek  that  vector  a  which  maximises  this  joint  density  function  subjeet 
to  the  constraint  that  the  first  component  of  |  be  one.  Ignoring  the  effect 
of  the  multiplicative  term  y,  the  psuedo  maximum- likelihood  selection  for  g 
then  corresponds  to  solving  the  following  constrained  minimisation  problem 

min  a*R«WRa  (4.5) 

a(l)-l 

Using  standard  Lagrange  multiplier  techniques,  the  solution  to  this 
constrained  minimization  problem  is  obtained  by  solving  the  following  system 
of  (p+l)x(p+l)  linear  equations 

**1  I  li  I  "  «  ii  (4.4) 

where  a  is  a  normalising  constant  selected  so  that  the  first  component  of 
A*  is  one. I  Expression  (4.4)  constitutes  the  so-called  hish  mlmyn 
method  of  autoregressive  parameter  selection  [15] —[20] . 

It  is  to  be  noted  that  in  minimising  functional  (4.5)  with  respect  to 
the  normalisation  constraint  imposed  on  g'l  first  component,  the  error  veotor 
is  being  minimized  in  the  least  squares  sense.  In  effect,  we  are  then 
selecting  &  so  as  to  best  satisfy  the  theoretical  relationship  (2*23)  given 
by  kjj|  "  i.  Using  this  interpretation,  the  positive  definite  matrix  T  can  be 


1  A  -  a 

In  those  rare  cases  where  the  (p+l)x(p+l)  matrix  is  singular,  the 

autoregressive  parameter  veetor  will  correspond  to  a  suitable  normalised 
eigenvector  associated  with  a  zero  eigenvalue  of  this  matrix. 
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alternatively  thought  of  aa  providing  a  weighting  (instead  of  being  an 
unknown  oovarianoe  matrix  inverse)  in  the  error  funotional  (6.5).  It  is 
therefore  logical  to  take  T  to  be  a  diagonal  matrix  whose  nonnegative 
diagonal  entries  w^  for  k  ■  1,2,  ...»  t  provide  a  neehanisa  for  weighting  in 
any  desirable  fashion  the  various  extended  Tule-Valker  equation 
approximations  appearing  in  (6*3).  The  uniform  weighting  selection 

W  -  I  (6.7) 

where  I  is  the  txt  identity  matrix  has  been  found  to  provide  excellent 
modeling  performance  when  the  matrix  estimate  R^  it  unbiased. 

A  few  words  are  now  appropriate  concerning  the  selection  of  the  integer 
t  which  specifies  the  number  of  extended  Yule-Walker  equations  that  are  being 
approxisuited.  When  t  is  set  equal  to  its  minimal  value  p,  the  approach  here 
taken  bears  a  close  resemblance  to  various  other  ARMA  modeling  sohemes  (evg., 
see  refs.  [26] , [281 ,[35] ,[38] ) .  In  this  oase,  the  minimal  number  of  p  error 
contaminated  extended  Yule-Walker  equation  evaluations  are  being  used  in 
fixing  the  model's  p  autoregressive  coefficients.  A  little  thought  should 
convince  oneself  of  the  potential  parameter  hyper aensitivity  which  can  ariae 
in  this  situation.  To  illustrate  this  point,  let  us  briefly  eonsider  the 
task  of  finding  a  line  which  'best'  fits  a  set  of  error  contaminated 
two-tuples  (xj.,jq).  Although  only  two  two-tuples  are  needed  to  fix  the 
line's  two  parameters  (i.e. ,  its  slope  and  y  intercept),  it  will  be  generally 
more  desirable  to  fix  these  parameters  by  using  more  than  this  minimal  number 
of  two-tuples  thereby  obtaining  a  more  'representative  linear  fit'.  This 
will  entail  finding  the  'best  least  squares  linear  fit'.  The  benefits 
generally  accrued  in  using  this  overdetermined  approach  are  demonstrated  in 
Figure  6.1. 

With  the  above  in  mind,  the  real  advantage  of  this  paper's  approaoh  is 
achieved  when  the  integer  t  is  selected  to  be  larger  than  p.  In  this  oase, 
more  than  the  minimal  number  of  extended  Yule-Walker  equation  evaluations 
(i.e.,  t  instead  of  p)  are  being  used  in  fixing  the  model's  p  autoregressive 
coefficients.  It  is  then  not  surprising  that  a  desirable  decrease  in 
parameter  hypersensitivity  is  generally  realised  upon  seleoting  t  >  p.  An 
indication  of  the  benefit  aeorued  by  seleeting  t  >  p  was  illustrated  in 
Section  III  for  the  oase  of  AR  modeling  with  perfect  autocorrelation  lag 
values.  A  similar  advantage  will  be  demonstrated  in  Seetion  VIII  when  ARXA 
models  are  generated  from  raw  time  series  observations.  In  the  situation 
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being  considered  here,  the  integer  psreaeter  t  is  typically  selected  to  lie 
within  the  rsnge 

piti  N-q-1  (6.8) 

with  generally  lsrser  rslnes  than  the  ainiaoa  p  being  preferred  for  aodeling 
fidelity. 

Fraa  an  owerall  aodeling  viewpoint,  the  standard  unbiased  estiaator  has 
been  found  to  generally  provide  the  beat  ohoioe  for  the  lag  estiaates 
required  in  expression  (6.2) .  Specifically,  the  required  autocorrelation  lag 
estiaate  entries  are  generated  according  to 

N-n 

A  1  5  x(k+n)*(k)  0£n£q+t  (6.9) 

rx(n)  -  -  L 

N-n  k-1 

where  q+t  corresponds  to  the  largest  autocorrelation  lag  arguaent  appearing 

A 

in  aatrix  Rj .  We  would  of  course  use  the  property  that 

*x(-a)  ”  rx(n)  to  obtain  any  negative  lag  autocorrelation  entries  which  any 

A 

be  needed  in  foraulating  R* .  In  using  this  unbiased  estiaate  approach,  the 
resultant  autocorrelation  aatrix  estiaate  will  have  a  desirable  Toeplitz 
structure. 

A 

The  (p+l)x(p+l)  aatrix  RjWRi  ,  which  ooapletely  characterises  the 
autoregressive  paraaeter  vector  solution  through  expression  (6.6),  will  have 
components  which  are  readily  coaputable  froa  the  estiaates  (6.9).  Using 
siaple  aatrix  aanipulations,  it  is  readily  shown  that  the  general  (i,j)tb 
eleaent  of  this  aatrix  is  specified  by 

t 

*lW%(i,j)  ■  ^  w(a)r(q+»*l)  r(q+wH-i-j)  for  1  £  i,j  ip+1  (6.10) 

where  the  w(a)  correspond  to  the  diagonal  eleaents  of  the  diagonal  weighting 

A*  A 

aatrix  V.  Upon  generation  of  the  aatrix  RjVR}  according  to  this  expression, 
the  required  autocorrelation  paraeter  vector  is  straightforwardly  obtained 
by  solving  the  systea  of  linear  equations  (6.6).  A  Fortran  prograa  listing 
of  an  inpl eaentation  of  this  procedure  is  given  in  the  appendix  where  the 
flexibility  of  using  the  standard  unbiased  or  the  standard  biased  (i.e., 
divisor  N-n  in  equation  (6.9)  is  replaced  by  N)  autocorrelation  estiaate  is 
available. 
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HA  Ptruff  r  Bftlwttlpn 

To  eaaplete  the  ARHA  modeling,  it  it  necessary  to  eoapute  an  ettiaate 
for  the  aoving  average  component  |Bq(oj w) |2.  It  haa  been  the  author * a 
experience  that  independent  of  which  procedure  ia  need,  thia  MA  ooaponent 
eatiaate  ia  alaoat  always  of  aignif icantly  lower  quality  than  the  aaaooiated 
AS  ooaponent 

P 

|Ap(eJu)l2  "  |  $  *k  (6.11) 

k-0 


in  whioh  «k  denote  the  antoregre stive  paraaeter  eatiaates  as  generated  froa 
expression  (6.6).  A  high  quality  low  order  MA  spectral  estiaator  haa  yet  to 
be  developed.  Despite  thia  shortcoming,  soae  reasonably  well  perforaing  MA 
estiaators  will  now  be  briefly  disousscdv 

Many  contemporary  MA  ooaponent  estiaators  are  baaed  on  utilising  the 
forward  nd  backward  residual  tiae  aeries  aaaooiated  with  an  AMMA  tiae 
series.  In  particular,  the  forward  residual  tiae  series  elements  are 
coaputed  froa  the  given  observations  (6.1)  and  the  autoregressive  p&raaeter 
estiaates  (6.6)  according  to 

P 

»f (n)  *  ^  ak  x(n-k)  p+1  i  n  iN  (6.12) 

k-0 

Siailarly,  the  backward  residuals  coapoaent  are  generated  using 
P 

*k(n)  ■  ^  »k  *(®+k)  li#i  N-p  (6.13) 

k-0 

As  indicated  ia  Section  II,  each  of  these  residual  tiae  series  will  be 
governed  by  the  saae  MA(q)  process  if  the  tiae  series  (x(n))  is  an  AIMA(p,q) 
process  with  autoregressive  paraaeters  ak.  With  this  ia  Bind,  a  procedure 
for  extracting  this  MA  characterisation  froa  the  coaputed  forward  and 
backward  residuals  will  now  be  given. 

The  aost  direct  procedure  for  achieving  the  required  MA(q)  estiaate  is 
to  first  generate  the  following  estiaates  of  the  residual  tiae  series*  first 
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It  is  also  possible  to  employ  tbs  smoothed  periodogram  to  obtain  another 
form  of  M ih(q)  estimate.  This  entails  segmenting  the  oompnted  residuals  in 
bloeks  of  length  q+1  (overlapping  or  not  overlapping)  and  then  averaging  the 
resultant  q+1  length  peri odogr sms  for  each  of  these  blocks.  This  procedure 
has  been  enployed  with  a  moderate  degree  of  success  [17K  Similarly,  ve 
could  make  obvious  adaptions  of  the  procedures  treated  in  Section  II  under 
the  A&JIA  modeling  subsection  to  aohieve  alternate  Hi  estimator.  For  example, 
if  ve  were  to  use  the  procedure  as  characterised  by  expreasion  (2.38), 
estimates  for  the  Cj|  parameters  would  be  computed  from 

n-1 

°n  ■  $  *k  *x(*“k)  1  £  n  £  p  (8.18) 

k*0 

The  required  ARMA  spectral  estimates  would  then  be  given  by  incorporating 
these  estisuites  into  expression  (2.33)  to  result  in 

A 

S*(eJ*»)  -  rx(0)  +  2Re  [D(eJ*»)]  (8.19) 

where  D(eJ«)  is  obtained  by  substituting  the  a^  and  c*  estimates  into  form 
(2.34). 


q+1  autocorrelation  lag* 


**(n)  - 


N-p-n 


J  £*f(n+p+k)sf(p+k)  +  «b(n+k)  ?b(k)]  .  0£n£q 

k-1 


(6.14) 


If  th«  residual  time  saris*  do  ia  faet  correspond  to  a  MA(q)  process,  it  will 
be  found  that  the  rg(n)  will  be  approximately  aero  for  n^.q+1,  This  oaa  be 
used  as  a  convenient  test  for  the  appropriateness  of  the  ARMA  nodal*  the 
order  selection,  and.  the  estimates  a^.  In  any  oase.  upon  taking  the  Fourier 
transforn  of  these  autocorrelation  lags,  we  obtain  the  MA(q)  spectral 
estinate  eonponent 

lBq(*j»)|2-  J  w(n)  r,(n)e-J*«  (6.15) 


in  which  w(n)  is  a  window  sequence  and  use  of  the  fact  that  rg(-*)  ■  rg(n) 
will  be  aade  when  evaluating  (6.14).  The  overall  ARMA(p.q)  spectral,  estinate 
is  then  given  by 


S(ej»)  - 

ip(ej»)|2 


(6.16) 


A 

where  Ap(*j»)  is  specified  by  expression  (6.11). 

A  few  words  are  now  appropriate  concerning  the  selection  of  the  window 
to  be  used  ia  estinate  (6.14).  If  the  rectangular  window  ohoice  w(n)  ■  1  is 
nade.  this  estimate  will  not  have  the  desired  property  of  being  guaranteed 
poeitive-senidef iaite.  To  achieve  this  positive^semidefiaiteaess,  we  could 
instead  choose  the  window  to  be 


(6.17) 


Unfortunately,  this  selection  can  give  rise  to  a  seriously  distorted  MA 
estimate  ia  view  of  the  triangular  like  weighting  thereby  snployed.  The 
selection  of  w(a)  is  quit*  important  and  this  choice  should  be  based  on  the 
particular  application  at  hand  and  user  experience. 
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VII.  AUtt  Modeling:  A  Singular  Value  Decomposition  Approach 

Ve  have  yet  to  address  the  important  issue  of  AKMA  model  order 
determination.  In  particular,  whether  one  is  provided  with  ezaot 
autocorrelation  lags  or  time  series  observations  for  effecting  the  modeling, 
how  one  chooses  appropriate  values  for  the  order  parameters  p  and  q  remains 
an  open  question.  It  is  recognised  that  this  model  order  information  is 
illicitly  contained  in  the  autocorrelation  matrices  which  characterise  ARMA 
models.  In  this  section,  we  shall  present  a  procedure  for  extracting  the 
prerequisite  model  order  values  which  will  make  use  of  a  singular  value 
decomposition  of  an  extended  autocorrelation  matrix.  An  important  byproduct 
of  this  procedure  will  be  an  adaption  of  the  ARMA  modeling  procedure  of  the 
previous  section  which  provides  for  a  significant  improvement  in  spectral 
estimation  performance. 

When  the  ARMA  model  order  parameters  are  not  known  apriori,  it  will  be 
judicious  to  select  the  initial  model  order  to  be  much  larger  than  the 
'anticipated'  order.  In  particular,  let  us  consider  the  extended  order  ARMA 
(Pe»4e)  model  for  which  p«  is  selected  to  be  larger  (usually  much  larger) 
than  the  eventual  model  order  parameter  p  to  be  used.  Although  we  typieally 
do  not  know  p  apriori,  it  is  generally  possible  to  make  an  educated  guess  of 
p  so  as  to  ensure  that 

Pe  >  P  (7.1) 

In  accordance  with  expression  (2.23),  it  then  follows  that  the  tx(ps+l) 
extended  order  autocorrelation  matrix  associated  with  this  ARMA(p0,q0)  model 
may  be  expressed  as 

rx  ( qe+1 )  rx(qe)  ...  rx(Re“Pe+D 

rx(*e+2)  rxWe+l)  .  .  •  *xUe“Pe+2> 

®e  ■  .  .  .  (7.2) 

•  •  ’• 

•  •  '• 

_fx(«e+t)  rx(q#+t-l)  .  .  .  *x(qe'Pe+t) 

If  the  autocorrelation  lag  entries  used  in  this  matrix  correspond  to  an 
ARMA  (p,q)  process  for  which  qe-p4  2.  <TP»  it  then  follows  from  the  results  of 
section  II  that  the  rank  of  the  tx(pe+l)  matrix  Re  will  be  p.  In  arriving  at 
this  result,  we  of  course  assume  that  t  is  selected  to  at  least  equal  pv  To 
determine  the  required  order  parameter  p,  we  then  simply  set  p  equal  to  the 


rank  of  Re  for  the  idoaliatie  oaao  ia  which  exact  autocorrelation  lag 
inforaation  ia  awailable. 

To  obtain  the  A1MA  model's  (p+l)xl  antoregreasiwe  paraaatar  weotor  g 
froa  this  extended  order  autocorrelation  aatrix,  it  ia  poaaible  to  appeal  to 
the  theoretical  derelopaeata  of  Seotioa  II.  In  particular,  let  ua  eoaaider 
the  act  of  aubaatricea  of  1,  foraed  froa  any  p+1  contiguous  eoluana.  This 
set  of  tx(p+l)  aa trices  ia  apeeified  by 

Eg  “  [aubaatrix  of  E«  ooaposed  of  ita  h^  through 

P+kth  ooluan  weotor  a  inclusively]  for  l£Jkipe-p+l  (7.3) 

In  accordance  with  the  AREA  aodel  extended  Tule-Walker  equation 
relationships.  it  ia  readily  established  that  the  required  unique 
autoregressive  paraaeter  vector  £  will  satisfy  the  set  of  hoaogeneous 
relationships 

fry*  ■  e  for  lihiPe-^1  (7.4) 

where  the  first  ooaponent  of  £  is  constrained  to  be  one.  In  point  of  fact, 
expression  (7.4)  provides  a  aatrix  representation  for  the  t  extended 
Tule-Walker  equationa  (2.22)  defined  on  the  specific  indiees  qe+2-k  Ini. 
q*+t+l-k.  It  is  important  to  note  that  this  eonelusion  will  be  valid  only  if 
the  autocorrelation  lag  entries  used  in  foraing  R0  correspond  to  an  AREA 
(p.q)  process,  and.  the  order  paraaeter s  are  such  that  pe  2.  P  end  q(-*p(  > 
«TP. 

We  shall  now  apply  this  rank  characterization  of  Rs  to  the  practical 
problsa  in  which  the  AREA  aodcling  is  to  be  based  only  on  the  tiae  series 
observations 

x(l).  x(2) *  ....  x(N)  (7.3) 

and  not  on  actual  autocorrelation  lag  inf oraation.  In  this  ease,  it  will  be 
necessary  to  first  eoapute  autocorrelation  lag  estiaates  froa  these 
observations.  These  estiaates  are  next  substituted  into  the  aatrix  foraat 
(7.2)  to  in  turn  generate  the  extended  order  autocorrelation  aatrix  eatiaate 
Ra.  Since  the  autocorrelation  lag  estiaate  entries  will  be  invariably  in 
error,  it  follows  that  the  aatrix  Rt  will  noraally  have  full  rank  (i.e.,  ain 
(Pe+l,t))  even  when  the  tiae  series  under  study  corresponds  to  an  AREA  (p.q) 
process.  Nonetheless,  even  though  R,  will  have  full  rank,  its  'effective' 
rank  will  still  tend  to  be  p.  To  better  quantify  the  vague  tern  'effective' 


rank,  it  will  be  beaefieiel  to  introduce  the  principle  of  singular  value 
deoonposition. 

S insular  Value  Decoemofition 

In  n  ▼eriety  of  applications,  the  ultimate  objective  will  be  that  of 
solving  a  linear  system  of  equation*.  The  matrix  aaaoeiated  with  this  ay atom 
of  equations  not  only  oharaeterixea  the  deaired  eolation,  bat,  it  will  alao 
very  often  convey  dynamical  property  information.  With  thia  in  mind,  it 
often  behooves  na  to  examine  the  aalient  propertiea  of  thia  characterizing 
matrix.  The  aingnlar  value  decompoaition  of  a  matrix  aa  outlined  in  the 
following  theorem  serves  thia  role  particularly  well  (e.g.,  aee  ref.  [27]  and 
[39]). 

Theorem  7.1:  Let  i  be  a  mxn  matrix  of  generally 
complex  valued  el  event a.  Then  there  exiata  mxn 
and  a™  unitary  na  trice  a  II  and  V,  reapeetively, 
aueh  that1 

A  -  HZ  V*  (7.6) 

where  lit  a  mxn  matrix  whoae  elanenta  are  zero 
except  poaaibly  along  ita  main  diagonal.  Theae 
nonnegative  diagonal  elementa  are  ordered  aueh 
that 

°11  ^  w22  ^  ®hh  ^  0 

where 

h  "  min  (m,n)*. 

The  diagonal  elementa  ovv  are  commonly  referred  to  aa  the  aingnlar  valuea  of 
matrix  A.  It  is  well  known  that  the  nonzero  singular  valuea  will  correspond 
to  the  positive  square  roots  of  the  eigenvalues  of  the  nonnegative  Hermitian 
matrices  AA*  and  A*A.  Moreover,  the  eolunns  of  U  (or  V)  will  correspond  to 
the  appropriately  ordered  orthonormal  eigenvectors  of  the  nonnegative 
Hermitian  matrices 
AA*  (or  A* A) . 

The  singular  valuea  cvv  convey  valuable  information  concerning  the  rank 
characterization  of  matrix  A.  Thia  is  readily  demonstrated  upon  considering 


1The  matrices  V  and  V  are  said  to  be  unitary  if  IT1  -  U*  and 

y-1  «  y*. 
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the  problea  of  finding  that  in  aatrix  of  rank  k  which  will  boat  approxiaate 
A  in  the  Frobenions  non  sanaa  (this  aaanaaa  that  qyy  >  0  with  k  £  h) .  The 
Frobaniona  non  of  tha  axn  aatrix  diffaranea  A-B  ia  defined  to  be 


1 1 A  —  B 1 1 


1/2 


l*ij**ij  I 


2 


(7.7) 


Va  now  aaak  to  find  that  axn  rank  k  aatrix  B  which  will  render  this  criterion 
a  ainiana.  Tha  solution  to  this  approxiaati on  problea  is  contained  in  tha 
following  theorea  [27] 

Theoraa  7.2:  Tha  unique  axn  aatrix  of  rank  k  ± 

Bank  [A]  which  bast  approxiaates  the  sum  aatrix  A 
in  the  Frobenions  non  sense  is  given  by 

A(k)  -  U  3Lk  v*  (7.8) 

where  U  and  V  are  aa  in  expresaion  (7.0  while  2Tk 
is  obtained  froa  ^  by  setting  to  xero  all  but 
its  k  largest  singular  values.  The  quality  of 
this  optiana  approximation  is  given  by 


llA-  A(k)|| 


r  k  71/2 

5  «jj2  Oik£h 
J-k+1 


(7.9) 


The  degree  to  which  A^  approxiaates  A  is  seen  to  be  dependent  on  the 
sua  of  the  (h-k)  saallcst  singular  values  squared.  As  k  approaches  h.  this 
sua  will  becoae  progressively  saaller  and  will  eventually  go  to  xero  at  k  - 
h.  In  order  to  provide  a  convenient  neasure  for  this  behavior  which  does  not 
depend  on  the  sixe  of  aatrix  A,  let  us  consider  the  nonalixed  ratio 


llA(k)|| 

IIaIi 

-  [°U2  likih  (7.10) 

L*U*  +  ®222  +  •  •  •  ♦  ®hh2  J 

Clearly,  this  nonalixed  ratio  approaches  its  aaxiaua  value  of  one  as  k 
approaches  h.  For  aatrioes  of  low  effective  rank,  the  quantity  (k)  is  dose 
to  one  fox  values  of  k  significantly  saaller  than  h.  On  the  other  hand. 


/ 
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■a trices  for  which  k  aust  take  on  high  values  (i.e. ,  k  k)  to  achieve  a  (k) 
near  one  are  said  to  be  of  kifk  effective  rank. 

Aucllcatlon  of  SVP  to  ABMA 

To  deteraine  the  required  order  for  an  ABMA  aodel*  we  shall  now  aake  a 
SVD  of  the  tx(p0+l)  extended  order  autoeorrelation  aatrix  estiaate  t(l  that 
is 

*e  -  &2V*  (7.11) 

where  U  and  V  are  txt  and  (p0+l)x(p0+l)  unitary  aatrices*  respectively  and 
n«  »  tx(p,+l)  aatrix  of  the  fora  ealled  out  in  Theorsa  7.1.  The  required 
autoregressive  order  p  is  obtained  by  exaaining  the  noraalixed  ratio  M  (k). 
Naaely,  p  is  set  equal  to  the  saalleat  value  of  k  for  whioh  1(k)  is  deeaed 
'adequately*  olose  to  one.  The  terminology  'adequately  olose  to  one'  is 
subjeetive  and  will  depend  on  the  particular  application  under  consideration 
as  well  as  user  experience  gained  through  eapirical  experiaentation.  In  say 
case*  the  net  result  of  this  step  Mil  be  a  rank  p  optiaua  approxiaation  of 
the  tx(p0+l)  extended  order  autocorrelation  aatrix*  that  is 

*e<P>  ■  H  Zp  V*  (7.12) 

A  slaple  aatrix  aanipulation  reveals  that  this  rank  p  approxiaation  aay  be 
equivalently  represented  as 
P 

®e^^  m  ^  ®nn  2n  In*  (7.13) 

UFl 

where  and  u  are  the  k*k  ooluan  vectors  of  the  txt  and  (p0+l)x(p0+l) 
unitary  aatrices  U  and  V,  respectively.  Ve  shall  now  provide  two  separate 
procedures  for  using  this  rank  p  approxiaation  for  effecting  autoregressive 
paraaeter  estiaatea. 


Method  I:  ARMA  (p0»q0)  aodel 


In  this  approach*  the  rank  p  approxiaation  (7.12)  is  interpreted  as 
providing  an  iaproved  estiaate  of  the  underlying  extended  autocorrelation 
aatrir.1  It  will  be  convenient  to  deeoapose  this  rank  p  approxiaation  as 
follows 

*#(P)  .  [tl(P>  I  R0(P>]  (7.14) 
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where  rj(p)  i«  the  left  aost  tzl  coluan  vector  of  Re^P)  end  is  e  txpe 

astriz  coaposed  of  the  pe  right  most  tzl  colnan  vectors  of  R0(p).  Ve  sow 
seek  s  (pe+l)zl  sntoregressive  psrsaeter  vector  ±  with  first  coaponent  equal 
to  one  that  will  satisfy  the  theoreticsl  relstionship 

R#(P)  *  -  £ 

Since  the  rsnk  of  R0(p)  is  less  then  full,  there  will  ezist  sn  infinity  of 
solutions  to  this  problea.  Ve  shell  select  the  ainiaua  norm  solution  ss 
specified  by 


[Ra<P>]#  ^(p) 


in  which  the  superscript  notation  #  denotes  the  operation  of  generalized 
(psuedo)  aatriz  inversion.  This  autoregressive  parsaeter  selection  procedure 
has  proved  to  be  particularly  effective  in  low  SNR  environaents.  It  is 
readily  shown  that  this  ainiaua  non  solution  csn  be  siaplified  to 
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The  best  rank  p  approximation  matrix  (7.12)  contains  within  its  column 
structure  the  characteristics  required  to  estimate  autoregressive  parameters 
of  a  lower  order  ARMh(p,q)  model.  In  particular,  the  submatrices  of  RgCp) 
composed  of  its  columns  k  through  p+k  inclusively  yield  rank  p  approximations 
of  the  tx(p+l)  autocorrelation  matrices  Kfc  for  1  £  k  £ 
Pe  -  p+1  as  specified  by  expression  (7.3).  Ve  shall  denote  these  rank  p 
approximations  by  .  Due  to  the  SVD  operation  and  errors  inherent  in 

generating  Re,  there  will  generally  not  exist  a  unique  autoregressive 
parameter  vector  with  first  component  equal  to  one  which  will  satisfy  all  of 
the  p'-p+l  estimates  of  relationships  (7.4).  Nonetheless,  it  is  still 
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desirable  to  find  an  autoregressive  parameter  vector  for  which  each  of  these 
relationships  are  almost  satisfied.  A  functional  that  measures  the  degree  to 


which  this  is  accomplished  is  given  by 

ft*)  -  a*  S<P>A  (7.16a) 

where 

Pe“P+l 

S<P>  -  J  »k(p>*Rk(p)  (7.16b) 

k-1 

The  (p+l)x(p+l)  matrix  S^P)  is  nonnegative  Hermitian  and  may  be  conveniently 
computed  using  the  relationship 

P  Pe-P+I 

S(p>  -  J  }  Znk  ^nk*  (7.17) 

n-1  k-1 


in  which  v  fc  denotes  the  (p+l)xl  vector  as  specified  by 

Xnk  m  [vn(*) .vn(k+l) .  ....  vn(k+p)]'  (7.18) 

1  i  k  i  P#-P+1 
1  <  n  i  p 

This  vector  is  seen  to  be  a  windowed  segment  of  the  nth  column  vector  (i.e. , 
vn)  of  the  unitary  matrix  V  that  in  part  identifies  the  SVD  representation 
(7.11)-.  Moreover,  due  to  the  simple  shift  relationship  between  the  vectors 
v^  and  v*-1.  it  is  possible  to  devise  an  iterative  procedure  for  updating  the 
(p+l)x(p+l)  matrices  v^v^*  as  k  evolves.  This  will  entail  (p+1)  computations 
for  each  value  of  k. 

Upon  generating  the  (p+l)x(p+l)  matrix  S^P),  we  next  wish  to  select  that 
autoregressive  parameter  vector  a  with  first  component  of  one  so  as  to 
minimise  quadratic  functional  (7.16).  This  constrained  minimization  will 
result  in  the  best  least  square  approximation  of  the  theoretical 
relationships  (7.4)-.  Using  standard  procedures,  the  required  optimum 
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autoregressive  parameter  veotor  is  found  by  solving  the  following  linear 
system  of  eqaationsl 

-  o  A!  (7.19) 

in  which  the  normlizing  constant  a  is  seleoted  so  that  the  first  component  of 
.a  is  one.  It  will  be  shown  in  the  next  section  that  this  SVD  version  of  the 
ARMA  modeling  procedure  can  lead  to  a  significant  improvement  in  modeling 
performance. 

The  concept  of  a  SVD  representation  has  been  previously  incorporated 
with  success  in  effecting  AS  models  [42]  and  [61].  Incorporation  of  an  SVD 
AS  model  was  there  shown  to  produce  an  increase  in  spectral  resolution 
capabilities.  More  recently,  the  SVD  representation  was  used  in  ASHA 
modeling  where  impressive  results  were  reported  [22]-.  Undoubtably,  the  iaqiact 
which  SVD  will  ultimately  have  on  spectral  estimation  (and  in  other 
applications)  is  only  beginning  to  be  appreciated. 


,<P> 


lln  those  rare  eases  where  S'F'  is  singular,  the  required  autoregressive 
parameter  vector  is  set  equal  to  an. appropriately  normalised  eigenvector 
asaoeiated  with  a  zero  eigenvalue  of  Sip  . 
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VIII.  Numerical  Examples 

la  this  section,  we  shell  investigate  the  comparative  spectral 
estimation  performance  of  the  rational  amdeling  procedures  as  developed  in 
Seotions  VI  and  VII  with  those  of  popularly  need  alternatives.  The  first 
example  will  treat  the  problem  of  effecting  a  rational  spectral  estimate  from 
a  set  of  observations  of  an  ARMA(4,4)  process.  In  the  seoond  example,  we 
shall  examine  the  modeling  performance  for  the  special  case  of  sinusoids  in 
white  noise. 

Example  1:  In  this  example,  we  shall  examine  the  time  series  as 

characterized  by  (see  ref.  [Ill) 

x(n)  -  xj^n)  +  X2<m)  +0.5  e(n)  (8.1a) 

which  is  composed  of  the  two  AR(2)  time  series  generated  according  to 

*l(n)  ■  0.4  xi(tt-l)  -  0.93  xi(n-2)  +  ei(n)  (8.1b) 

X2(n)  ■  -0.5  X2(wl)  ~  0.93  X2(n^2)  +  «2(n)  (8.1c) 

where  s(n).  *i(n)  and  S2(n)  are  mutually  uncorrelated  Gaussian  zero  mean 
white  noise  processes  with  variance  one.  A  simply  analysis  indicates  that 
the  power  spectral  density  function  associated  with  time  series  (8.1)  is 
given  by 

S^m)  -  |l  -  0.4e-J«  +  0.93e-i*«l"2  + 

|l  +  0.5e-j«  +  0.93e“J2«r2  +  0.25  (8.2) 

and  is  plotted  in  Figure  7.1a. 

Using  the  time  series  description  (8.1).  twenty  statistically 

independent  realizations  each  of  length  125  were  next  generated.  These  20 
realizations  were  then  used  to  compare  the  modeling  effectiveness  of  this 
paper's  method  with  the  Box-Jenkin's  maximum-likelihood  method.  The  twenty 
(one  for  each  realization)  superimposed  ARMA  (4,4)  speotral  estimates 
obtained  using  the  Box-Jenkins  iterative  method  are  shown  in  Figure  8.1b. 
The  'somber  of  iterations  required  to  achieve  these  estimates  ranged  from  10 
to  700  with  50  being  a  typical  requirement.  Next,  this  paper's  method  as 

represented  by  expression  (6.6)  with  unbiased  autocorrelation  lag  estimates 

and  W  ■  I  was  used  to  obtain  the  ABMA  (4,4)  model's  autoregressive 

coefficients.  Relationship  (€.15)  with  the  window  selection  (6.17)  was  used 
in  forming  the  MA  component  of  the  speotral  estimates.  The  twenty 
superimposed  ARNA  (4,4)  speotral  estimates  thereby  obtained  are  shown  in 
Figures  8.1c,  8. Id,  and  8.1e  for  various  ohoioes  of  t.  From  these  plots,  it 


is  apparent  that  progressively  improved  estimates  are  achieved  upon 
increasing  t  from  its  minimal  value  of  4,  to  8,  and  then  to  20.  Moreover, 
these  speotral  estimates  were  of  higher  quality  than  those  obtained  with  the 
maximum-likelihood  method  vhieh  exhibited  a  larger  variance  in  estimate. 

Example  2:  In  this  example,  we  shall  investigate  the  comparative 

speotral  estimation  performances  of  various  widely  used  methods  on  the 
classical  sinusoids  in  additive  white  noise  problem.  The  particular  time 
series  to  be  considered  is  given  by 

x(n)  -  sin(2nfjn)  +  sin(2itf2n)  +  w(n),  1  i  n  £N  (8.3) 

fl  -  0.2,  f2  -  0.215,  <j\  -  0.5 

This  time  series  was  previously  examined  in  Section  III  where  different 
rational  models  were  generated  from  the  'exact'  antooorrelation  lags 
associated  with  it.  This  is  a  particularly  appropriate  time  series  for 
testing  the  resolution  capabilities  of  spectral  estimators  because  of  the 
closeness  of  the  sinusoidal  frequencies  (i.e. ,  f2-fj  -  0.015)  the 

prevailing  low  signal— to— noise  ratio  of  xero  dB  (individual  sinusoid  power  to 
total  noise  power). 

In  order  to  gain  a  reasonable  good  statistical  basis  for  comparison,  ten 
statistically  independent  realizations  of  the  time  series  (8.3)  were 
generated  with  each  realization  being  of  length  128  (i.e.,  N  -  128).  Using 
these  ten  different  sets  of  time  series  observations,  ten  speotral  estimates 
were  made  for  various  widely  used  rational  speotral  estimators.  The 
resultant  ten  spectral  estimates  for  each  estimator  were  then  plotted  in 
Figures  8.2  to  8.7  in  a  superimposed  fashion  (except  for  the  periodogram)  so 
as  to  depict  consistency  of  estimate.  The  ideal  estimate  would  of  course  be 
two  sharply  defined  peaks  at  frequencies  0.2  and  0.215.  A  brief  description 
of  the  different  estimators  and  their  performance  on  these  test  samples  is 
now  given. 

MA  Estimates:  The  periodogram  as  implemented  by  the  fast  Fourier 

transform  was  first  used  in  generating  spectral  estimates  for  eaoh  of  the  ten 
different  128  data  length  realizations.  Specifically,  expression  (4.7)  with 
N  -  128  was  incorporated  into  the  MA  speotral  estimator  (4.5)  to  generate  the 
sampled  periodogram  estimate 
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2 


0  i  k  i  N-l 


(8.4) 


N-l 

(•rj-i 

5  x(n+l)  e“j^~ 
n-0  N 

It  was  found  that  each  of  the  tan  periodogrsms  produced  renarkably  similar 
results.  A  typical  128  point  FFT  periodogran  for  one  of  these  trials  is 
shosrn  in  Figure  8.2a.  From  this  plot  (and  the  nine  others  not  shown),  it  was 
not  possible  to  nnaabiguously  detect  the  presence  of  two  spectral  peaks  at 
frequencies  0.2  and  0.215. 

In  order  to  ease  the  potential  aabiguity  created  by  the  finite  frequency 
sampling  of  the  periodogran  (i.e.,  Am  ”  2n/N) ,  the  concept  of  naddina  as 
described  in  Section  IV  was  next  incorporated.  Using  this  approach,  the 
original  tine  series  observation  of  length  128  was  next  appended  with  128 
zeroes.  The  resultant  258  point  padded  FFT  periodograa  is  shown  in  Figure 
8.2b.  In  this  padded  case,  we  are  able  to  unambiguously  detect  the  presence 
of  the  two  spectral  peaks  at  0.2  and  0.215.  A  further  padding  of  256  zeroes 
is  found  to  result  in  the  512  point  padded  FFT  periodogran  shown  in  Figure 
8.2c.  The  prerequisite  spectral  resolution  is  again  achieved. 

AS  Bstinates:  In  AE.  modeling,  the  most  widely  used  procedure  is  the 
Burg  algorithm.  With  this  in  mind,  the  Burg  algorithm  was  next  used  to 
generate  spectral  estiautes  for  each  of  the  aforementioned  ten  observation 
sets  of  length  128.  The  ten  superimposed  Burg  AR(20)  estimates  which 
resulted  are  depicted  in  Figure  8.3a.  Although  a  detection  of  spectral 
energy  in  the  region  about  f  -  0.2  is  evident,  the  appearance  of  two  spectral 
peaks  is  not.  The  ordering  selection  p*20  was  evidently  not  sufficient  for 
the  required  resolution.  Upon  increasing  the  AS  order  to  p  ■  24,  however, 
the  Burg  AR(24)  estimates  produced  two  reasonably  well  defined  peaks  about  f 
*0.2  and  f  “  0.215  in  nine  out  of  the  ten  estimates.  These  estimates  are 
plotted  in  superimposed  fashion  in  Figure  8.3b.  It  was  further  determined 
that  more  sharply  defined  peaks  are  achieved  in  all  ten  estimates  when  the 
order  was  increased  to  forty.  The  Burg  algorithm  is  then  seen  to  provide  a 
satisfactory  resolution  performance  for  the  time  series  under  study  provided 
that  the  AE  order  is  selected  to  be  on  the  order  of  24  or  more. 

In  order  to  demonstrate  the  effect  of  using  more  than  the  minimal  number 
of  extended  Yul e-Walker  equations  in  arriving  at  an  AE  model  (the  Burg 
algorithm  uses  the  minimal  number),  the  ASHA  modeling  technique  as  embodied 
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ill  expression  (6.6)  with  W  “  I  and  unbiased  autocorrelation  lag  estiaates  was 
next  need  with  p  -  20,  q  -  0,  and,  t  -  50.  The  resultant  ten  AK(20)  spectral 
estiaates  which  arose  when  using  this  approach  are  shown  in  Fignre  8.3c.  A 
resolution  of  the  two  sinnsoids  was  achieved  in  all  ten  estiaates.  It  is 
significant  that  the  lover  order  AK(20)  speotral  estiaates  as  generated  using 
this  paper's  aethod  provided  a ore  sharply  defined  peaks  than  the  higher  order 
Burg  AK(24)  spectral  estiaates.  This  is  priaarily  due  to  the  fact  that  fifty 
extended  Yule-Walker  equations  were  used  in  specifying  the  20  autoregressive 
paraaeters.  The  degree  of  saoothing  achieved  in  applying  this  approach  is 
evident  froa  this  nuaerieal  exaaple. 

AKMA  Bstiaates:  The  ARMA  aodeling  procedure  as  represented  by 

expression  (6.6)  with  f  ■  I  and  unbiased  autocorrelation  lag  estiaate  entries 
vas  next  used  to  generate  estiaates  of  the  autoregressive  coefficients  of  an 
ARl(A(p,p)  aodel  for  p  ■  8  and  12.  In  accordance  with  the  results  of  Section 
III,  plots  of  lAp(ej*)l~2  were  then  aade  so  as  to  reveal  the  required 

speotral  inforaation  for  the  sinusoids  in  white  noise  case  (i.e.,  the  zeroes 
are  not  used).  In  Figure  8.4a,  the  ten  AK(8,8)  speotral  estiaates  which 
arose  for  a  ohoioe  of  t  ■  70  are  shown  superiaposed.  Although  spectral 
energy  in  the  neighborhood  of  f  -  0.2  is  detected,  the  presence  of  the 
required  two  speotral  peaks  is  not.  Clearly,  the  order  selection  p»8  vas  not 
sufficient  to  achieve  the  desired  resolution.  Upon  increasing  the  order  to 
AKMA  (12,12)  and  retaining  t  -  70,  however,  the  resultant  ten  spectral 

estiaates  shown  in  Figure  8.4b  each  achieved  the  desired  spectral  resolution 
with  two  sharply  defined  peaks  about  f  ■  0.2  and  f  ■  0.215.  These  spectral 
estiaates  have  been  obtained  with  but  twelve  autoregressive  paraaeters,  and, 
are  seen  to  be  sianificantlv  superior  to  the  Burg  AR(24)  estiaates  vhieh 
required  twenty-four  autoregressive  paraaeters.  In  teras  of  spectral 

estiaation  fidelity  and  parsaeter  parsiaony  (i.e.,  effective  use  of 

paraaeters),  it  is  dear  that  the  AKMA  aodeling  aethod  herein  developed  has 
provided  a  superior  perforaance  for  the  problea  at  hand. 

A  truly  significant  increase  in  spectral  estiaation  perforaanoe  is 
achieved  upon  adopting  the  SVD  approaches  to  AKMA  aodeling  as  outlined  in 
Section  VII.  Naaely,  after  setting  P*  ■  q9  •  14  and  t  ■*  50,  it  was  found 
that  the  effective  rank  of  the  extended  order  autocorrelation  aatrix  estiaate 
Ke  was  four.  Setting  pM  and  using  relationship  (7.15),  the  tea  AKMA(14,14) 
speotral  estiaates  whioh  arose  are  shown  in  Figure  8.4e.  Next,  letting  p“4 
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in  aggression  (7.17),  the  ten  SVD  derived  ARMA(4,4)  epeetrel  eatinetes  vhioh 
arose  are  shown  snperiaposed  in  Figure  8.4c.  These  speotral  estimates  are 
not  only  of  uniformly  high  quality,  but,  they  represent  the  lowest  order 
rational  model  whioh  is  compa table  with  the  two  sinusoids  in  white  noise 
ease.  Moreover,  the  quality  of  the  peak  frequency  estimates  and  associated 
pole  magnitude  (theoretically  equal  to  one)  estimates  is  exceptional  as  shown 
in  Table  8.1.  The  quantities  ffc(pfc)  and  (®g  2)  for  h  *■  1,2  represent 

the  ssmpled  means  and  variances,  respectively,  of  the  peak  frequencies  (pole 
magnitudes)  as  determined  from  the  ten  speotral  estimates. 


k 

*k 

*k 

ofk 

Ipki 

aipkl 

1 

0.20 

0.1998 

0.0012 

0.9944 

0.0062 

2 

0.215 

0.2159 

0.0011 

0.9974 

0.0080 

Table  8.1:  Statistics  of  SVD  ARMA  (4,4)  estimates. 

To  demonstrate  the  worth  of  singular  values  in  model  order  determination 
when  using  the  SVfi  approach,  the  fifteen  singular  values  which  characterised 
the  extended  order  autocorrelation  matrix  estimate  Re  for  one  of  the  ten 
observation  sets  are  now  given 

*  18.3  ,  m  18.2  ,  j  ■  5.30  ,  *  4.89 

*  0.85  ,  m  0.78  ,  .  .  .  ,  25  m  0.21 

It  is  apparent  that  the  first  four  singular  values  are  dominant  (i.e. ,  (4)  ” 

0.995)  thereby  indicating  that  the  effective  rank  of  Re  is  four.  Thus,  the 
correct  selection  of  ARNA  order  p  ■  q  ■  4  is  made  upon  examination  of  the 
singular  values  behavior.- 

Alternative  Method:  In  Section  III,  an  alternate  method  for  detecting  and 
estimating  the  frequencies  of  sinusoids  in  white  noise  was  proposed.  This 


■etliod  is  represented  bp  the  least  squares  solution  (3.9)  to  an  overdeterained 
system  of  linear  equations.  Using  this  expression  with  a  seleotion  of  p  ■  14, 
t  ■  30,  I  -  I,  and,  unbiased  autocorrelation  lag  estimate  entries  for  Kt, 
spectral  estimates  for  eaeh  of  the  ten  tine  series  of  length  128  were 
generated.  The  results  of  these  estiaates  are  depioted  in  superiaposed  style 
in  Figure  8.S  in  plots  of  1/ U^feJ*) I2.  Two  sharply  resolved  peaks  are 

achieved  in  eaoh  of  the  ten  estiaates.  It  is  noteworthy  that  this  procedure 

provided  good  estiaates  for  a  low  order  choice.  A  paper  now  in  preparation 
will  demonstrate  the  exceptional  performance  of  this  new  procedure  for  a  aore 
general  class  of  deterministic  signals  in  white  noise.  Improvement  is  there 
achieved  by  Baking  an  estimate  of  the  white  noise  variance  e*  using  expression 

(3.2)  at  H),  then  subtracting  this  estiaate  froa  the  rz(0)  tens  and  then  using 

an  SVD.  Initial  eapirieal  evidence  suggests  that  this  new  approach  provides 
significantly  better  perforaanee  than  the  Pisarenko  method  [55]  and  it 
variants,  snd,  the  Cuaaresan-Tuf ts  approach  [42], [61]. 

Comparison  with  the  Kuaaresan-Tufts  Method 

We  shall  now  consider  a  time  series  of  fora  (8.3)  in  which  the  relevant 
paraaeters  are  given  by 

*1  -  0.2,  f 2  ■  0.21,  Op2  -  1.778 

This  particular  parameter  choice  provides  a  aore  challenging  test  of  resolution 
capability  in  that  the  frequency  spacing  f2~fi  "  0.01  is  smaller  and  the  SNR  of 
-5dB  is  lower  than  that  of  tiae  series  (8.3).  Again  ten  statistically  sample 
runs  each  of  length  128  were  used  for  testing  four  AR  type  aodels.  In  the 
first  AR  model,  expression  (7.2)  with  choices  of  qv  *  -1,  pe  -  35,  t  ■  90 
(giving  90  TV  equation  approximations)  were  aadev  Unbiased  autocorrelation 
estiaates  were  then  used  to  fora  the  90  x  36  aatrix  estiaate  R(,  Finally,  the 
optiaoa  autoregressive  paraaeter  estiaates  were  generated  upon  using  expression 
(7.15).  The  resultant  ten  AR(3S)  spectral  estiaates  are  shown  in  superiaposed 
plots  in  Fig.  8.6a  where  resolution  was  achieved  in  eaeh  of  the  ten  runs. 
Next,  the  extended  autocorrelation  aatrix  aodel  (7.2)  with  qg  *  -1,  pg  «*  96,  t 
-  96,  and  unbiased  autocorrelation  lags  was  tested.  Expression  (7.15)  with  p  * 
4  was  then  used  to  generate  the  a^  estiaates  of  the  AR(96)  aodel.  A  plot  of 
the  resultant  spectra  is  shown  in  Fig.  8.6b  where  resolution  was  achieved  for 
eaeh  of  the  tea  rune. 
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Th«  KaaaresaarTuf ts  itthod,  thiek  provide*  *  near  naxiuua-li-'^ihood 
perforaanoe,  ns  next  tested  on  these  seae  tea  saaple  raas  [61]  .  The  reealtaat 
AS  type  35*k  aad  96**  (the  optiaaa  KT  order  ohoioe)  order  spectra  sre  shova 
plotted  ia  Figs.  8.6c  aad  8.6d,  respectively.  The  35th  order  aodel  vss  enable 
to  resolve  the  siaasoids  ia  say  of  the  tea  raas  vhile  the  96*k  order  aodel 
achieved  a  resolatioa  ia  each  case.  For  this  exaaple,  it  is  appareat  that  the 
overexteaded  aodeliag  approach  advocated  ia  this  paper  has  oatperforaed  the 
pseado  aaxiaaa  likelihood  Kaaaresan- Tofts  aethod.  Moreover*  the  ooapatatioaal 
effieieacy  of  this  paper's  overexteaded  aodeliag  aethod  (7-.15)  is  far  saperior 
as  vill  be  dooaaeated  ia  a  forthcoaiag  paper. 

Adaptive  ASMA.  Mode lias 

As  a  final  exaaple.  the  adaptive  ASMA  aodeliag  procedure  to  be  developed 
in  Section  X  vss  applied  to  the  tiae  series  (8.3)  in  vhich  the  covariance  aode 

(kl  “40.  k2  “  1)  was  selected  with  ASMA  order  p>12.  The  spectral  estiaates  of 
five  independent  raas  at  data  lengths  N  “  128.  N  ■  256  and  N  ■  1024  c-re  shova 
superiaposed  in  Figure  8.7.  Fran  these  plots,  it  is  apparent  that  the  tvelfth 
order  ASMA  aodel  detects  the  presence  of  spectral  energy  in  the  neighborhood  of 
f  “  0.2  at  data  length  N  *  128.  bat,  the  resolatioa  of  two  spectral  peaks  is 
soaevhat  unsatisfactory.  As  the  ASMA  aodel  adapts  to  the  data,  however,  two 
veil  defined  spectral  peaks  appear  at  N  “  256.  The  aodel  has  therefore  adapted 
to  the  signal  using  less  than  256  tiae  series  observation*. 

To  illustrate  the  perforaaaee  of  this  adaptive  ASMA  approach  relative  to 
popularly  used  aethods,  the  classical  adaptive  AK  covarianoe  aethod  to  be 
developed  in  section  IX  vas  next  used  oa  the  sane  set  of  tiae  series 
observations^  The  five  speotral  plots  vhioh  arose  for  an  AR(22)  aodel  are 
shova  superiaposed  in  Figure  8.8  at  N  “  128,  N  -  256,  and,  N  -  1024.  Clearly, 
the  higher  order  oovariaaoe  AK  aodel  vas  aaable  to  satisfactorily  resolve  the 
tvo  sinusoids  even  at  data  leagth  N-1024 .  Thus,  the  lover  order  ARMA  (12,12) 
covarianoe  adaptive  aodel  significantly  oatperforaed  the  higher  order  AR(22) 
oovariaaoe  adaptive  aodel.  This  is  indeed  noteworthy  when  it  is  realised  that 
some  of  the  aore  widely  used  adaptive  filters  utilise  the  AK  covarianoe  aodel. 
This  includes  the  fast  LMS  algoritha  of  Norf  [25]  ,[46] ,  [47]  aad  the 
approxi anting  gradient  approach  of  Vidrow  [65] . 
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Fig.  8.1  ARMA  spectral  estimates  of  order  (4,4).  (a)Exact.  (b)  Box- 
Jenklns  maxi mum- like 11 hood  method,  (c)  Paper's  method  for  t«4. 

(d)  Paper's  methpd  for  ta8.  (e)  Paper's  Method  for  ta20. 
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(a)  N  =  128 
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Fig.  8.2  Moving  average  (MA)  spectral 
estimates  using  the  FFT 
implementation  of  the  perlodogram 
with  128  time  series  observations 
(a)  no  zero  padding,  (b)  128  zero 
padding,  (c)  384  zero  padding. 
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Fig.  8.3  Autoregressive  (AR)  spectral  estimates 
from  128  time  series  observations 
(a)  p  ■  20  Burg  estimate,  (b)  p  -  24 
Burg  estimate  (c)  This  paper's  method 
(6.6)  with  q  «  0,  p  -  20,  t  «  56 
estimate  using  expression  (6.6). 
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Fig.  8.7  Adaptive  ARf 
estimates  w 
(a)  N  =  128 
(c)  N  =  102- 
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Fig.  8.8  Adaptive  AR(12)  spectral  estimates 
using  the  covariance  method. 

(a)  N  =  128,  (b)  N  -  256, 

(c)  N  =  1024. 
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IX.  AR  Modeling:  Adaptive  Implementation 

In  section  V,  a  general  procedure  for  effecting  an  AR  nodal  which 
represents  a  set  of  given  tine  series  observations 

x(l),  x(2) *  ....  x(N)  (9.1) 

was  presented.  It  was  there  shown  that  the  required  modeling  entailed  using 
these  tine  series  observations  to  genarate  estimates  for  the  entries  of  the 
(p+l)x(p+l)  AR  antooorrelation  matrix  as  specified  by 

R(i.j)  -  rx(i-j)  1  i  i,j  i  p+1  (9.2) 

From  a  performance  viewpoint,  the  unbiased  antooorrelation  estimate  (5.4)  was 
suggested  as  a  logical  choice  for  estimating  these  entries.  In  ary  case, 
once  estimates  for  the  R(i.j)  elements  have  been  made,  the  required  AR(p) 
model  parameters  are  obtained  by  solving  the  linear  system  of  equations 

*  A  -  lbDl2  ®i  (9.3) 

where  it  will  be  recalled  that  the  parameter  bQ  is  chosen  so  that  the  first 
component  of  £  is  one. 

In  applications  requiring  a  continuous  updating  of  the  AR  model 
parameters  as  new  time  series  observations  become  available  (i.e..  x(N+l), 
x(N+2) .  ...).  however,  the  standard  unbiased  estimator  approaoh  poses  a 

serious  computational  burden.-  To  overcome  this  difficulty,  it  behooves  us  to 
seek  alternate  autocorrelation  estimators  which  are  more  ameanable  to  an 
adaptive  solution.  With  this  objective  in  mind,  we  shall  now  consider  the 
adaptive  class  of  autocorrelation  estimators  as  defined  by 

N+k2-l 

R(i.j)  m  1  5  x( k+l-i ) x (k+1- j )  liiip+1  (9.4) 

N+k2-*l  k.kl  lUlP+1 

in  whioh  the  convention  of  setting  to  zero  any  summand  terms  x(n)  whose 
argument  lies  outside  the  observation  set  l£n$(  has  again  been  adopted. 
Although  this  expression  might  initially  appear  to  be  unduly  contrived,  it 
does  provide  us  with  an  autocorrelation  estimate  of  rz(i-j)  as  called  out  for 
in  expression  (9.2).  More  importantly,  however,  this  estimator  will  be 
shortly  shown  to  have  a  most  convenient  matrix  product  representation. 

The  integer  constants  k2  and  k2  whioh  characterize  the  autocorrelation 
lag  estimator  rule  (9.4)  are  to  be  selected  so  that  the  number  of  lag 
products  there  used  (i.a. ,  N+k2-k2)  at  least  equals  p+1.  This  requirement 
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will  generally  ensue  tie  invertibility  of  the  sntooorrelstion  nstriz 

A 

estiaste  R  associated  with  the  estinstes  (9.4).  In  most  onset  of  interest, 
these  constants  are  further  confined  to  the  range  l<ti .ko<p+l  although  other 
ohoioes  are  penaissable.  It  then  follows  that  each  neaber  of  the  adaptive 
autocorrelation  estiaator  class  will  be  identified  by  a  specific  two-tnple 
(kl,k2)**  Moreover,  each  such  estiaator  will  provide  a  generally  different 
set  of  autocorrelation  estiutes  froa  the  given  set  of  tine  series 
observations  (9.1)'. 

Meabers  of  the  adaptive  class  of  autocorrelation  estiaators  have  a 
particularly  convenient  algebraic  representation  whioh  we  shall  caploy  when 
effecting  the  proaised  adaptive  iapleaentation.  Specifically,  the 
(p+l)x(p+l)  autocorrelation  aatrix  estimate  that  arises  upon  using  the 
estinstes  (9.4)  as  entries  can  be  always  expressed  in  the  following  data 
aatrix  product  format 

R  -  - i —  (9.5) 

N+k2-ki 

in  which  %  is  the  (N+k2~ki)x(p+l)  data  matrix  whose  individual  eleaents  are 
specified  by 

^N(i.j)  -  x(ki+i-j)  liiSI+k2-ki  (9.6) 

lijip+1 

We  have  here  appended  the  subscript  N  to  the  data  aatrix  so  as  to  explicitly 
recognize  its  dependency  on  the  data  length.  The  incorporation  of  this 
subscript  will  be  also  useful  when  obtaining  the  proaised  adaptive 
iapleaentation.  A  straightforward  analysis  will  deaonstrate  the  equivalency 
of  expr.ssions  (9.4)  and  (9.5).  The  data  aatrix  is  seen  to  have  eleaents 
whose  entries  are  the  given  tiae  series  observations  (9.1)  as  well  as  zeroes 
whioh  appear  whenever  the  tiae  index  arguaent  (kj+i-j)  foils  outside  the 
observation  set  lj[n£t* 

It  is  possible  to  provide  a  revealing  visual  interpretation  to  the 
concept  of  data  windowing  for  this  olass  of  estiaators.  In  particular,  let 
us  consider  the  following  (N+p)x(p+l)  kernel  Toeplitz  type  aatrix  whioh 


lAa  we  will  shortly  see,  the  fou  aost  widely  used  aeabers  of  this  olass  are 
the  covarianoe  aethod  (k2»p+l,  k2«l),  the  autocorrelation  aethod  (k2"l, 
k2«p+l) ,  the  prewindow  aethod  (k2«l,  k2**l) ,  and,  the  postwindow  aethod 

(kj-p+1 ,k2»p+l) . 


69 


contains,  as  submatrioes,  all  of  the  data  aatriees  assooiated  with  the 
adaptive  olass  of  autocorrelation  estimators. 


x(l) 


'.O 


x(p+l) . x(l) 


x(N) 


.  z 


(N-p) 


kl  -  1 


Prewindowing 
-  kl  “  P+1 

- k2  -  1 


.  x(N> 


Postwindowing 
-  k2  -  p+1 


(9.7) 


Upon  e ism i nation  of  the  data  matrix  definition  (9.6),  it  is  apparent  that  % 
map  be  identified  with  that  submatrix  of  ^  composed  of  its  h2st  through 
(N+k2-l)st  rows,  inclusively.  Thus,  corresponding  to  each  adaptive 
autocorrelation  estimator  (i.e. ,  pair  (k2,k2)),  we  may  obtain  the  assooiated 
data  matrix  using  this  row  identification  scheme. 

The  zeroes  which  appear  in  the  upper  right  corner  of  the  kernel  matrix 
14  are  there  due  to  the  implioit  assumption  that  x(n)  *  0  for  -p+l£n£0.  This 
rather  unrealistic  assumption  concerning  an  unobserved  segment  of  the  time 
series  is  commonly  referred  to  as  a  prewindowing  of  the  data.  It  is  seen 
that  a  degree  of  prewindowing  is  incorporated  whenever  the  constant  k2  is 
seleoted  suoh  that  lih^ <p.  Normally,  suoh  ohoioes  are  to  be  avoided  sinoe 
they  will  generally  lead  to  relatively  poor  AR  modeling  due  to  the 
unrealistic  prewindow  assumption  thereby  being  made  on  the  time  series.  As 
kj  ranges  over  the  integers  1  to  p+1,  the  degree  of  prewindowing  incorporated 
varies  from  full  at  kj»l  to  none  at  kj»p+l.  This  prewindowing  behavior  is 
conveniently  depioted  in  expression  (9.7) . 

In  a  similar  fashion,  the  zeroes  whieh  appear  in  the  lower  left  corner 
of  matrix  are  there  due  to  the  implicit  oostwindow  assumption  that  x(n)  ■  0 
for  N+l£nfli+p.  This  equally  unrealistic  assumption  concerning  an  unobserved 
segment  of  the  time  series  is  to  be  generally  avoided.  A  degree  of 
postwindowing  is  incorporated  whenever  the  index  k2  is  ohosen  to  lie  in  the 
range  2^Jt2£p+i.  The  smallest  value  of  k2  for  whioh  the  postwindow  assmption 
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is  Avoided  is  seen  to  bo  k2«l.  Thus,  os  the  index  k2  ranges  from  1  to  p+1, 
the  degree  of  postwindowing  incorporated  varies  from  none  at  k2«l  to  full  at 
kj-p+lv 

The  four  aost  widely  used  of  the  adaptive  autocorrelation  eatiaator 
methods  axe  listed  in  Table  9.1  (e.g.,  see  refs.  [25]  ,146]  ,147]) .  Each  of 
the  aethods  there  shown  are  seen  to  entail  combinations  of  aaxisna  windowing 
and  no  windowing.  In  the  covariance  aethod,  the  charaoteristio  constants  are 
chosen  to  be  k^  »  p+1  and  k2“l.  This  particular  choice  is  seen  to  provide 
the  laraest  nuaber  of  lag  produots  in  the  autocorrelation  eatiaates  (9.4) 
over  which  no  data  windowing  is  involved.  As  aight  be  expected,  the 
covariance  aethod  generally  provides  the  best  AS  aodeling  and  apeetral 
resolution  pexforaanee  when  compared  with  the  remaining  a  cabers  of  the 
adaptive  autocorrelation  estiswtor  class.  With  this  in  aind.  unless  special 
considerations  dictate  otherwise,  the  covariance  method  is  the  aost 
preferable  choice  f o*  an  adaptive  iaplenentation. 

In  the  three  remaining  aethods  listed  in  Table  9.1.  it  is  seen  that  a 
aaxiaua  aaount  of  prewindowing.  postwindowing.  or.  both  are  being  eaployedr.- 
It  is  then  not  surprising  that  each  of  these  methods  will  generally  provide 
relatively  poor  modeling  pexforaanee.  This  will  be  particularly  true  for 
data  lengths  N  which  are  not  significantly  larger  than  the  AR  order  parameter 
p.  As  the  data  length  N  increases  so  that  N»p.  however,  each  of  the  four 
aethods  will  provide  coaparable  aodeling  perfoxaance.  This  is  due  to  the 
faot  that  the  windowed  portions  of  the  data  aatrix  will  play  a 

A 

proportionately  saaller  role  in  the  estiaate  R  as  N  increase*.  An 
appreciation  for  this  behavior  is  readily  obtained  upon  exaaination  of  the 
kernel  aatrix  (9.7). 

As  suggested  earlier,  the  priaary  reason  fox  preferring  the  adaptive 
autocorrelation  eatiaator  (9.4)  over  the  standard  unbiased  estimator  (5.4)  is 
that  the  foraer  aay  be  used  to  effeot  a  computationally  efficient  adaptive  AR 
aodeling  aethod.  To  gain  an  inaight  as  to  why  this  is  so,  let  us  first 
substitute  the  autocorrelation  aatrix  estiaate  (9.5)  into  the  fundamental  AR 
aodeling  expression  (9.3).  The  required  paraaeters  of  the  AR(p)  model  are 
then  found  by  solving  the  resultant  system  of  normal  equations 

■  (N+k2“kj_)  |b0|*  (9.S) 

in  which  the  noraalixing  paraaeter  b0  is  to  be  seleoted  so  that  the  first 
ooaponent  of  is  one.  The  data  aatrix  produot  in  this  expression  is 
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METHOD 

CONSTANT 

kl 

CONSTANT 

k2 

STATISTICAL 
PROPERTIES 
OF  R 

1.  Covariance 
(No  windowing) 

P+1 

1 

(i)  unbiased 
(ii)  consistent 

2.  Full  Prewindowing 
No  Postvindowing 

1 

1 

(i)  biased 
(ii)  consistent 

3.  Full  Postwindowing 
No  Prewindowing 

P+1 

P+1 

(i)  biased 
(ii)  consistent 

4.  Autocorrelation 
(Full  pre  and 
postwindowing) 

1 

P+1 

(i)  biased 
(ii)  consistent 
( iii)  Toeplitz 

Table  9.1  Adaptive  AR  Autocorrelation  Estimation  Methods 


seem  to  completely  characterize  the  desired  autoregressive  parameter  vector 
Sfj  associated  with  the  N  time  series  observations  (9.1) . 

As  the  tine  index  N  is  incremented  by  one  (i.e.,  the  (N+l)*fc  time  series 
observation  x(N+l)  becomes  available),  it  ia  seen  that  a  new  system  of  normal 
equations  of  form  (9.8)  vill  arise  in  which  the  index  N  is  replaced  by  N+l. 
The  resultant  data  matrix  product  Xn+1*iN+1  which  characterises  this  new 
system  of  equations  will  in  turn  give  rise  to  the  updated  autoregressive 
parameter  vector  jty+i.  Ve  oan  continue  this  systematic  procedure  to  generate 
the  updated  autoregreasive  parameter  vectors  1n+2»  jn+j ,  etc.  as  the  new  time 
series  observations  x(N+2),  x(N+3),  etc.  become  available.  The  ability  to 
evolve  an  adaptive  solution  procedure  when  using  this  approach  vill  be  then 
dependent  on  our  obtaining  an  effective  method  for  updating  the  data  matrix 
products  2n*%  **  N  evolves. 

Adaptive  Alaorlthm;  kj_«  1 

The  adaptive  expression  relating  the  successive  data  matrix  produots 
will  be  considerably  eased  if  the  constant  1&2  is  selected  so  as  to  provide 
either  no  or  full  postvindowing.  To  illustrate  this  point  let  us  first 


72 


exaaine  the  cui  of  nonpostwiadowins  for  whioh  k2«i  while  allowing  hi  to  take 
on  any  value  in  [l,p+l] .  From  examination  of  the  defining  expreaaion  (9- .6) , 
it  ie  seen  that  the  data  aatrix  X^+l  may  be  obtained  by  appending  a  row 
vector  to  the  bottom  of  data  aatrix  This  results  in  the  following 

reonrsion  on  the  data  aatrix  prodnots 

%+l*%+l  “  XjJ  \  +  «+12k+i  N  1  p+1  (9.9) 

in  which  is  the  above  mentioned  appended  lx(p+l)  row  vector 

•  [x(N+l) ,  x(N),  ...,  x(N+l-p)]  (9.10) 

It  is  important  to  note  that  this  data  aatrix  product  recursive  expression 
coaaenoes  at  N«p+1  which  corresponds  to  the  first  tine  index  at  which  X^X^ 
has  its  full  fora.  Thus,  the  aatrix  Xp+iXp+l  serves  the  role  of  initializing 
the  above  recursive  relationship.  The  eleaents  of  this  initializing  aatrix 
are  obtained  froa  expression  (9.4)  upon  setting  k2*i  and  N-p+1 ,  that  is 

P+1 

Xp+1  Xp+l(i.j)  -  J  x(k+l-i)  x(k+l-j)  1  i  i  i  p+1  (9.11) 

«1  1  i  j  i  P+1 

It  is  interesting  to  note  that  although  each  aember  of  the  nonpostwindow 
class  (as  identified  by  k2«i  and  kie[l,p+ll)  will  be  governed  by  the  same 
recursion  (9.9),  they  will  each  give  rise  to  a  generally  different  set  of 
autocorrelation  estiaates.  This  is  due  to  the  fact  that  the  initializing 
aatrix  (9.11)  will  be  generally  different  for  various  choices  of  k2. 

Froa  recursive  expression  (9.9),  it  is  seen  that  successive  data  aatrix 
produots  differ  by  the  rank  one  aatrix  x^+^  This  simple 

interrelationship  will  in  turn  enable  us  to  obtain  a  recursive  expression  for 
the  data  aatrix  product  inverses  [Xg  Xn]~1.  Ve  are  interested  in  these 
inverses  sinoe  they  will  be  ultimately  used  when  solving  expression  (9.8)  for 
the  AR  aodel  paraaeters.  This  required  aatrix  inverse  recursion  will  aake 
use  of  expression  (9.9)  and  the  following  well  known  leaaa 

L causa  9.1:  Let  A  and  A+  u*  y  each  be  nonsingular  sxs  aatriees  where  y 
and  x.  are  lxs  vectors,  then 

(A  +  a*  y]"1  -  A-1  -  tA  1  (9.12) 

(1  +  y  A"*  y*) 
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Upon  setting  A  ■  %  end  a  ■  z  ■  1N+1  this  leans,  the  required  recursive 

nstrix  inverse  expression  is  found  to  be 

e 

*  .  *  ,  TN+1  T»i+i 

[%+l  iN+ll"1  -  [%  In]-1  -  _ I _ Z___  for  H  2  p+kx 

1  +  ™+1  *N+1 

where  (9.13) 

yjl+1  * 

In  using  this  nstrix  inverse  recursion,  it  is  inportsnt  to  note  thst  it  is 
only  sppliosble  for  tine  indices  N  2.  P+h]..  This  is  s  direct  consequence  of 
the  fset  thst  the  dsts  nstrix  products  X^  X^  sre  singulsr  for  sll  tine 
indices  N  <  p+kj  in  the  nonpostwindowing  esse  k2*l.  To  use  this  recursive 
spproech,  it  is  therefore  neeesssry  to  first  ooapute  the  initislixing  nstrix 
inverse  [X^  %]”*  for  N  *  p+k^  using  s  stsndsrd  nstrix  inversion  routine  such 
ss  Gsussisn  elininstion.  Subsequent  nstrix  inverses  for  N  >  p+k^  nsy  be  then 
efficiently  obtsined  upon  using  recursion  fomuls  (9.13)  . 

To  conplete  the  sdsptive  AR  nodeling  procedure,  we  next  incorporste  the 
dsts  nstrix  product  inverse  routine  (9.13)  into  the  AR  nodeling  equstions 
(9.8) .  A  little  thought  will  oonvinoe  oneself  thst  the  sinple  three  step 
procedure  outlined  in  Tsble  9.2  will  provide  the  required  sdsptive 
sutoregressive  psrsaeter  vector  procedure.  The  second  step  is  seen  to  yield 
the  unnornslized  solution  to  expression  (9.8)  with  N  replsoed  by  N+l  while 
the  third  step  ensures  thst  the  first  coupons nt  of  ajj  is  one.  In  terns  of 
conputstionsl  conplexity,  sn  exaninstion  of  equstion  (9.13)  indicstes  thst 

e 

2(p+l)  operstions  will  be  required  for  updsting  [2$%]  .  The  resultsnt 

sutoregressive  psrsneter  vector  solution  ss  represented  by  steps  2  end  3  of 
Tsble  9.2  will  require  sn  sdditionsl  (p+1)  operstions.  Thus,  the 
conputstionsl  conplexity  of  the  nonpostwindowing  sdsptive  slgorithn  (i.e», 
kj-l)  is  then  o(p2).  This  slgorithnic  approach  is  sppliosble  for  any 
selection  of  the  constsnt  kx  with  the  nost  likely  choioes  being  fron  the 
rsnge  [l,p+l] .  The  nost  useful  iapleaentstion  of  this  sdsptive  slgorithn 
corresponds  to  the  selection  kj-p+1.  In  this  esse,  the  covsrisnoe  nethod  ss 
specified  by  k^-p+1 ,  k2“l  is  obtsined.  As  pointed  out  previously,  this 
choice  noraslly  provides  the  best  sdsptive  AR  nodeling  perfornsnoe  behsvior. 
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Step  0: 

Input  Data:  x(N+l),  [X^X^]”1 

Step  1: 

Coapute  [X*+1  xn+1 ] -1  using  recursion  (5.14) 

Step  2: 

L,t  4  ■  ‘CiW1  a. 

Step  3: 

AN+1  ■  o(l)”1  £  where  c(l)  ia  the  first  coaponent  of  £. 

Table  9.2:  Adaptive  AR  Modeling  Algoritha-Co variance  Methoda 
(No>2p+l)  and  Previndov  (No- p+1)  nethoda . 


Adaptive  Algor itlua  k2  *  p+1 

Vaing  ainilar  reasoning,  it  ia  also  possible  to  evolve  an  efficient 
adaptive  algorithm  for  the  full  prewindowing  oaae  kj  ■  p+1 .  In  this 
situation,  it  ia  readily  found  that  the  data  aatrix  products  are  recursively 
related  scot  .-ding  to 

*fl+lXN+l  -  +  DN+1  Nip+1  (9.14) 


in  which  is  a  (p+l)x(p+l)  Toeplitx  conjugate  syaaetrie  aatrix  with 
eleaents 


**<i.j> 


X(N+1)  x(N+l+i-j) 
x(N+l)  X(N+l+i-j) 


i  i  J 
i  l  i 


(9.15) 


Doe  to  the  Toeplitx  conjugate  ayaaetric  property  of  the  perturbation  aatrix 
it  will  be  possible  to  evolve  an  efficient  adaptive  aethod  for  inverting 
the  data  aatrix  produots  [X^X^].  The  ooaputational  ooaplexity  of  this  aatrix 
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inversion  routine  will  be  also  o(p2) .  The  details  of  this  rontiae  are  rather 
involved  and  will  be  therefore  not  given  here  dne  to  space  limitations. 
Since  the  covariance  nethod  is  the  aost  preferable  choice  for  the  adaptive 
class  of  autocorrelation  estimators,  however,  this  omission  is  not  serious  in 
any  case.  It  is  to  be  noted  that  efficient  adaptive  lattice  structured 

algorithms  may  also  be  employed  for  updating  the  AR  parameters  [46], [47]. 

RpgwtrthJarirotf  Avm  9i9lh 

In  some  applications,  it  is  possible  to  aohieve  a  degree  of  improvement 
in  the  AR  speotral  models  by  using  the  concept  of  data  time  reversal. 
Namely,  it  makes  use  of  the  observation  that  if  (x(n)]  represents  a 

wide-sense  stationary  process,  then  its  time  transposed  conjugated  image  as 
specified  by 

y(n)  ■  R(s-n)  (9.16) 

will  also  be  wide-sense  stationary  for  any  choice  of  the  shift  variable  ». 
Moreover,  the  autocorrelation  sequence  of  this  time  transposed  conjugated 
iawge  is  readily  found  to  be  identical  to  that  of  the  original  time  series, 
that  is 

ry(n)  «  rx(n)  (9.17) 

It  is  now  possible  to  use  this  time  transpose  property  to  effect  a  new 
autocorrelation  estiswtion  scheme.  In  particular,  upon  selecting  s~N-l,  the 
original  observation  set  (9.1)  is  seen  to  give  rise  to  the  following  set  of 
time  transposed  conjugated  elements 

y(n)  -  X(Ntl-n)  1  i  n  i  N  (9.18) 

If  these  time  reversed  observations  are  incorporated  into  expression  (9.4), 
it  will  be  generally  found  that  a  new  set  of  autocorrelation  estiauites  will 
result.  In  particular,  the  overall  backward  autocorrelation  matrix  estisute 
will  take  the  form'. 

*  -  _ - _  IfllN  (9.19) 

N+k2-ki 


in  which  the  elements  of  the  (N+k2-kj) x (p+1 )  matrix  Tjj  are  given  by 
YN(i,j)  -  KN+l-ki-i+j)  1  i  i  i  N+k2-k2 

1  i  j  i  P+1 


(9.20) 


Although  the  forward  and  haohward  antooorrelation  matrix  estimates  (9.5) 
and  (9.19)  will  be  generally  different  (except  for  the  autocorrelation  choice 
kj  •  1,  kj  »  P+1)*  they  are  each  seeking  to  estimate  the  same  underlying 
autocorrelation  matrix  R.  It  then  follows  the  so-ealled  forward-backward 
eatimate  aa  specified  by 

*  -  _ 1 (9-2D 

2(N+k2-ki) 


will  provide  an  additional  improvement  in  antooorrelation  estimation 
fidelity.  Thia  is  due  to  the  fact  that  each  of  the  entities  Z$X|i|  and  y£In 
will  contain  lag  products  not  found  in  the  other. 

The  additional  autocorrelation  estimation  fidelity  achieved  in  using 
this  time  transposition  approach  typically  results  in  a  marginal  improvement 
in  spectral  estimation  performance.  Fortunately*  this  iaprovement  is  not 
accrued  at  the  cost  of  an  excessive  increase  in  computational  complexity. 
This  is  due  to  the  fact  that  the  matrices  I*  and  T'  which  form  R  are  Toeplitx 
type.  It  is  therefore  possible  to  devise  efficient  algorithms  that  will 
solve  the  system  of  equations 

Ulfa  +  -  a  91  ^9.22) 

in  which  the  computational  complexity  is  o(p2) . 

AR  Model  Order  Determination 

One  of  the  principal  considerations  in  obtaining  AR  models  from  raw  time 
series  observations  is  that  of  model  order  selection.  It  has  been  observed 
that  when  p  is  selected  too  low*  there  will  be  generally  too  few  model  poles 
to  adequately  represent  the  underlying  spectrum.  On  the  other  hand*  too  high 
of  a  choice  for  p  will  typically  result  in  spurious  effects  (e.g.*  false 
peaks)  in  the  spectral  estimate.  With  these  thoughts  in  mind*  investigators 
have  proposed  various  order  selection  procedures.  Three  of  the  more  widely 
used  techniques  are  Akiake's  final  prediction  error  method  as  well  as  his 
information  criterion  [1]*[2],[4],  and*  Parzen's  'criterion  autoregressive 
transfer'  function  [34].  Although  these  procedures  typically  work  well*  they 
can  yield  unsatisfactory  performance  in  some  oases  (e.g.*  see  ref.  [34]  and 
[$2]).  The  user  is  therefore  cautioned  to  use  discretion  in  applying  the 
above  and  other  model  order  determination  procedures.  The  method  to  be 
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ultimately  used  should  be  determined  through  empirical  experimentation  based 
on  time  series  related  to  the  specific  application  under  consideration. 

It  is  possible  to  apply  a  conceptionally  straightforward  procedure  for 
model  order  seleotion  which  does  not  possess  many  of  the  drawbacks  alluded  to 
abowe.  It  is  based  on  the  observation  that  in  the  case  where  perfect  AR 
autocorrelation  lag  values  are  given,  the  (p+l)x(p+l)  autocorrelation  matrix 
K  with  elements 

R(i.j)  -  rx(i-j)  li  i.j  i  p+1  (9.23) 

will  have  rank  p+1  so  long  as  p  is  less  than  or  equal  to  the  order  of  the 
underlying  AR  process  (hereafter  taken  to  be  pj) .  For  all  values  of  p 
greater  than  p^,  however,  the  rank  of  the  (p+l)x(p+l)  autocorrelation  matrix 
will  be  p£.  Thus,  to  determine  the  proper  rank  selection  in  the  idealistic 
case  of  perfect  autocorrelation  lag  information,  we  simply  increase  the 
parsmeter  p  until  the  rank  of  R  is  less  than  full  (i.e..  less  than  p+1). 
This  will  occur  at  p  ■  pj+1,  thereby  giving  us  the  appropriate  order 
selection.  It  should  be  noted  that  when  the  autocorrelation  lags  being  used 
don't  correspond  to  an  AR  process,  then  the  matrix  R  will  be  generally  of 
full  rank  for  all  p  2  1. 

In  the  more  realistio  case  in  which  raw  time  series  are  used  to  form  the 

a 

autocorrelation  matrix  estiswte  R.  the  rank  of  this  matrix  will  be  typically 
full  for  all  values  of  p.  This  will  be  true  even  when  the  time  series  is  an 
AR  process.  This  seeming  contradiction  arises  due  to  statistical  errors 
inherent  in  any  autocorrelation  lag  estisution  procedure  that  might  be  used 

A 

in  forming  R.  Nonetheless,  even  though  R  will  have  full  rank,  it  will  be 
generally  found  that  when  p  >  p^,  this  matrix  will  have  (p~pj)  of  its 
eigenvalues  'close*  to  zero.  Thus,  an  order  seleotion  procedure  which  has 
provided  satisfactory  performance  is  one  entailing  examination  of  the 

A 

eigenvalue  behaviour  of  the  autocorrelation  matrix  estimate  R  as  a  funotion 
of  p.  The  appropriate  order  choice  will  be  that  value  of  p,  denoted  by  pj, 
for  which  R  has  (p-pj)  of  its  eigenvalues  sufficiently  close  to  zero  for  all 
p  >  pi.  A  particularly  attractive  method  (i.e.,  the  SVD  method)  for 
implementing  this  procedure  was  given  in  Section  X. 


78 


X.  ARMA  Modeling:  Adaptive  Implementation 

When  an  adaptive  iapleaentation  of  the  ARMA  aodeling  Methods  aa 
described  in  Sections  VI  and  VII  is  desired,  it  will  be  necessary  to 
incorporate  autocorrelation  lag  estiaate  procedures  vhioh  are  eoapatible  with 
an  adaptive  iapleaentation  (the  unbiased  estimator  is  not  ooapatible).  In 
particular,  ve  shall  now  examine  a  class  of  estiaators  which  provides  an 
adaptive  aechanisa  for  estiaating  the  eleaents  of  the  autocorrelation  Matrix 

Ml  as  required  in  expression  (6.6).  Ihis  class  of  adaptive  estiaators  will 
be  governed  by  the  relationship 

N-q-2+k2 

EjU.j)  -  1  5  liiit  (10.1) 

N+kl-kj-O-l  ^  liJirn 

It  is  apparent  that  this  expression  provides  an  estiaate  for  the  lag  eleaent 
rx(q+l+i-j)  which  is  the  (i,j)*M  eleaent  of  the  autocorrelation  matrix  Ri  as 
defined  in  equation  (6.2).  The  fixed  constants  k2  and  k2  which  characterize 
this  estiaator  are  noraally  selected  so  that  the  number  of  lag  products  there 
used  (i.e. ,  N+k2-k2-q-i)  equals  or  exceeds  p+1.  This  choice  will  generally 

A  A 

ensure  the  invertibility  of  matrix  RjfRi  and  thereby  a  unique  solution  for 
the  autoregressive  parameter  when  using  expression  (6.6).  For  reasons  which 
will  be  shortly  made  apparent,  these  constants  are  usually  further 
constrained  to  satisfy  1  £  k2  £  t  and  1  £  k2  £  P+1  although  other  choices  are 
possible. 

Eaoh  autocorrelation  estiaator  in  the  adaptive  class  (10.1)  will  be 
identified  by  a  particular  choioe  of  the  two-tuple  (kj,k2).  Moreover,  each 
estiaator  in  this  class  will  provide  a  generally  different  set  of 
autocorrelation  lag  estiaates  from  the  set  of  tiae  series  observations  x(n) 
for  1  £  n  £  N.  Clearly,  our  ultiaate  desire  is  to  select  that  estiaator 
which  generally  provides  the  best  ARMA  aodeling.  The  covariance  estiaator  as 
identified  by  k2-t  and  k2"l  furnishes  an  obvious  choice.  Before  treating 
specific  estimators,  however,  let  us  first  examine  the  general  adaptive 
estiaator  (10.1). 

The  primary  reason  as  to  why  the  adaptive  estimator  (10.1)  lends  itself 
to  an  adaptive  iapleaentation  is  due  to  the  algebraio  structure  iaplioitly 
contained  within  its  definition.  Naaely,  the  autocorrelation  matrix  estimate 
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as  foraed  froa  the  entries  (10.1)  aay  be  always  representable  in  the 
convenient  aatriz  prodnet  foraat 


A 

*1 


(10.2) 


in  whieh  the  (N+kj-k  j-q-1 ) x (p+1 )  data  aatriz  %  has  its  eleaents  epeeified  by 
%(i,j)  *  z(kj+q*l+i-j)  1  £  i  £  H+k2-k1-q-l  (10.3) 

liJiP+1 

while  the  (N+ky-kj-f-Dit  data  aatriz  I)|  has  eleaents 

YN(i,j)  -  z(ki  +  i-j)  1  £  i  iN+k2-krrl  (10.4) 

1  £  j  £  t 

A  siaple  aatriz  aanipnlation  will  prows  the  equivalencies  of  ezpressions 
(10.1)  and  (10.2) .  Ve  again  adopt  the  convention  of  setting  to  zero  any 
eleaent  entries  of  %  or  Yjj  for  which  z(n)  lies  ontside  the  observation 
interval  1  £  n  £  N,  and,  ve  also  sttaoh  the  snbsoript  N  to  these  data 
aatriees  so  as  to  ezplieitly  recognize  there  dependency  on  data  length. 

As  in  the  AR  modeling  case,  the  parsaeters  k*  and  k2  that  identify  the 
autocorrelation  estiaator  (10.1)  can  give  rise  to  data  windowing.  To  see  why 
this  is  so,  let  ns  consider  two  kernel  Toeplitz  type  natriees  whieh  eontain, 
as  snbaa trices,  all  of  the  data  aatriees  associated  with  the  adaptive  class 
of  ASHA  autocorrelation  estiaator s.  These  kernel  aatriees  are  specified  by 


L  - 


k2«l 

4 


kj-p+l 


z(q+2)  .  .  .  z(*-p+2) 


z(t+q+l)  .  .  .  z(t+q-p+l 


z(N) 


z(N-p) 


O 


z(N) 


Kn 


*(1) 


O 


z(t)  .  .  .  z(l) 


kl“l 


r 


previndoving 

i 

kx-t 


(10.5) 


z(N-q-l)  .  .  x(N-q-t) 


z(N+p-q-l).  .  z(N+p-q-t) 


Upon  examination  of  expressions  (10.3)  and  (10.4).  it  is  readily  established 
that  tha  data  matrix  %  (or  Y^)  nay  ba  identified  with  that  submatrix  of  tha 
kernel  aatrix  X n  (ox  composed  of  ita  kj**  through  (N-q-2+k2)#t  rows 

inclusively.  Thus,  c or r a spending  to  aaoh  adaptive  autoeoxralatiou  astiuator 
(i.e*.  ehoiea  of  pair  (ki.k2))»  thara  will  ba  aa  assooiatad  pair  of  data 
aatrioes  obtaiaad  by  uaiag  this  row  idaatifieatioa  aohaaa. 

Tha  zaroaa  whioh  appear  ia  tha  upper  right  ooraar  of  kernel  aatrix  H  N 
are  thara  due  to  tha  implicit  prewindow  aaauaptioa  that  x(a)  ■  0  for  2-t 
This  uaraalietio  rastrictioa  oa  aa  uaobaarred  aagaaat  of  tha  tiaa  aariaa  ia 
to  ba  normally  avoided.  It  is  to  ba  noted  froa  tha  rapraaaatatioa  for  % 
that  a  aalaotioa  of  ki)t  vill  avoid  any  data  prewiadowing.  On  tha  other 
hand,  a  degree  of  prewindoving  ia  incorporated  whenever  kj  ia  such  that  1 
Iki  <t-l .  Thus,  as  kj  ranges  over  tha  iatagars  1  to  t,  tha  amount  of 
prewiadowing  varies  froa  full  at  k^“l  to  none  at  k^  ■  t. 

Ia  a  siailar  fashion,  tha  xaroaa  which  arise  ia  tha  lower  left  ooraar  of 
kernel  matrix  X^  are  thara  due  to  tha  implicit  postwindow  aasmption  that 
x(a)  ■  0  for  N+l£n£l+p.  This  contrived  assumption  oa  aa  unobserved  segment 
of  tha  time  series  is  also  to  be  avoided.  Upon  axaaiaatioa  of  tha  kernel 
aatrix  it  ia  apparent  that  poatwiadowing  nay  be  avoided  by  aeleetiag  k2  i 
1.  It  ia  also  dear  that  tha  degree  of  poatwiadowing  varies  from  none  at 
k2«l  to  full  at  k2“p+l . 

The  four  most  appealing  ehoioes  for  adaptive  estimators  are  identified 
ia  Table  10.1  ia  which  it  is  noted  that  each  involves  combinations  of  maximum 
windowing  and  no  windowing.  The  covariance  method  entails  that  particular 
combination  of  no  prewiadowing  (i.a..  k*  ■  t)  and  no  poatwiadowing  (i.a., 
k2«l) .  This  method  ia  seen  to  provide  the  laraeat  number  of  lag  produota 
(i.a.,  N-q-t)  ia  estimator  (10.1)  for  which  no  data  windowing  ia  involved. 
Aa  night  be  expected,  the  oovarianoe  method  typically  provides  the  beet 
modeling  performance  from  the  AKKA  adaptive  class  of  autocorrelation 
estimators. 

The  three  other  methods  listed  in  Table  10.1  are  seen  to  employ  either 
full  prewiadowing.  full  poatwiadowing.  or.  both.  It  is  clear  that  the 
modeling  performance  capabilities  of  each  of  these  three  methods  will  tend  to 
be  relatively  poor  when  the  data  length  N  is  only  marginally  larger  than  the 
ASKA  order  parameter  p  or  the  parameter  t.  On  the  other  hand,  for  the  ease 
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in  *Uek  N  is  mueh  largsr  than  either  p  or  t(  eaeh  of  the  uthoda  liated  in 
Table  10.1  will  provide  comparable  node ling  perf ornance.  This  ia  a 
oonssquenee  of  tbe  faet  that  the  windowed  portions  of  the  data  matrices  g 
and  k  play  a  proprtionately  smaller  role  ia  tbe  estimate  of  aa  N 
increases.  In  any  ease*  unless  special  considerations  dictate  otherwise*  the 
covariance  method  is  the  most  preferable  choioe  for  an  adaptive 
impl  amenta  ti  on. 


METHOD  CONSTANT  CONSTANT  STATISTICAL 

k2  k2  PROPERTIES 

OF  Bn 

1.  Covariance 

(No  windowing) 

t 

1 

( i)  unbiased 

(ii)  consistent 

2.  Full  Prewindowing 

No  Postwindowing 

1 

1 

(i)  biased 

(ii)  consistent 

3.  Full  Postwindowing 
No  Prewindowing 

t 

p+1 

(i)  biased 

(ii)  consistent 

4.  Aatooorrelation 
(Full  pre  and 
postwindowing) 

1 

p+1 

(i)  biased 

(ii)  consistent 

(iii)  Toeplitx 

Table  10.1:  Four  MIMA  Adaptive  Aatooorrelation 
Estimator  Methods 

In  order  to  provide  the  reason  as  to  why  members  of  the  adaptive  class 
of  antocorrelation  estimators  are  ameanable  to  an  adaptive  implementation* 
let  as  substitute  the  matrix  product  representation  for  Bj  as  given  by 
expression  (10.2)  into  the  basio  JEMA  modeling  equation  (6.6) .  The  resultant 
autoregressive  parameter  vector  is  then  obtained  by  solving  the  normal 
equations 

JN  “  •  H  <!*•«> 

where  the  weighting  matrix  T  has  been  set  equal  to  the  identity  matrix  while 
the  normalising  constant  a  is  selected  so  that  the  first  component  of  ig  is 
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on*.  Fran  this  expression  it  is  apparent  that  the  data  Matrix  orodnnt 
completely  identifies  the  AKXA  nodel  '  s  autoregressive  pa  rase  ter 
Tea  tor.  In  order  to  compute  it  will  be  then  necessary  to  oonpnte  the 
data  natrix  product's  inverse  at  each  value  of  N  where  the  autoregressiwe 
parameter  vector  is  required.  This  oan  be  a  particularly  imposing 
coaputational  task  if  real  tine  signal  processing  is  to  be 
arehiered. 

Adaptive  Algorithm*  kj -  1 

When  the  autoregressive  parsaeter  vector  is  required  at  each  tine  index 
N,  it  will  be  benefieial  to  effect  an  adaptive  method  for  updating  the  data 
■atrix  produet  in  expression  (10.6) .  This  adaptive  implementation  ia  readily 
achieved  for  the  nonpostwindowing  ease  kj»l  in  whioh  k*  may  on  any 

appropriate  value  (e.g.«  liki<t) .  Namely,  upon  examination  of  expression 
(10.5)  with  k2*l.  it  is  seen  that  the  data  matrices  1||+1  and  lg+i  are 
obtained  by  appending  appropriate  row  vectors  to  the  bottom  of  data  ma trice a 
Ag  and  lg.  respectively.  Using  this  property,  the  following  recursion  on  the 
data  matrix  product  is  obtained 

*8+1  *f+l  ■  *8%  +  2N  Hit  (10.7a) 

where  jg  and  yji  are  the  above  mentioned  lx(p+l)  and  lxt  row  vectors,  that  are 
appended  to  Ig  Mg  respectively.  These  vectors  are  specified  by 

-  Ix(N+l).  x(N) .  ....  x(N+l-p)]  (10.7b) 

7N  -  Ix(N-q).  x(N-q-l),  ....  x(N+l-q-t)l  (10.7o) 

It  is  to  be  noted  that  recursive  expression  (10.7a)  holds  only  fox  time 
indices  N^t  since  t  is  the  first  time  index  where  1^  takes  on  its  full 
algebraic  form.  With  this  ia  mind.  l{zt  then  serves  the  role  of  an 
initialising  matrix  for  this  recursion.  Although  the  perturbation  matrix 
Zn*h  in  this  recursion  does  not  depend  on  the  parameter  kj,  the  initialising 
matrix  YfXt  does.  As  such,  the  sequence  of  matrix  products  as  generated  by 
expression  (10.7s)  will  be  different  for  various  choices  of  k^. 

The  full  data  matrix  product  as  required  in  expression  (10.6)  may  be 
readily  obtained  from  relationship  (10.7)  and  takes  on  the  following 
recursive  form 

38+iin+ii8+iin+i  ■  ♦  M&i  *  *n*n  +  (ZnZn)  J&n 

Hit  (lO.ta) 

where  gg  is  the  lx(p+l)  vector  given  by 

"  Zn 


(lO.tb) 


w. 


An  imiattioa  of  this  recursive  expression  indicates  that  t(p+l)  oporstioas 
irt  required  to  caapute  vhilo  snothsr  2(p+l)a  oporstioas  sro  ozpoadod  ia 
updating  tho  fall  astriz  prodnot  (10.8s)-.  Ia  arriving  st  this  ooapmtstioasl 
roqniroaoat  aeasure,  it  hss  boon  tacitly  asssaod  that  tho  astriz  prodaots 

sad  are  available. 

Whoa  apdstos  of  the  satoregressiTe  paraaeter  too tors  aro  aot  roqsirod 
at  oaeh  tiao  iadoz  N,  wo  ooald  then  asa  roearsioas  (10.7)  sad  (10.8)  to 
eoapate  the  data  astriz  prodaots  and  in  a  coaputationally 

offieioat  aaanor.  At  those  tiae  instants  at  whieh  the  oral  nation  of  jg  is 
required,  we  would  then  siaply  solwe  the  ASHA  node ling  eqaatioas  (10.6) .  If 
standard  procedures  are  used,  this  solution  will  entail  on  the  order  of 
(p+l)3  eoapate tioasv 

In  warioas  applications,  how  or  or,  it  nay  be  necessary  to  eoapate  the 
autoregressive  parsaeter  vector  at  eaeh  (or  nearly  each)  value  of  tiae  N. 
For  such  oases,  it  would  be  aueh  aore  advantageous  to  replace  the  recursion 
(10.8)  by  a  recursion  for  the  inverse  astriz  product  To  of  feet 

this  recursion,  wc  note  froa  relationship  (10.8)  that  the  astriz  prodaots 
at  two  contiguous  tiae  indices  (i.e.»  N  and  N+l)  differ  by  the  sua 
of  three  rank  one  aatriocsv  Using  this  fact,  it  is  then  possible  to  apply 
Leaas  9.1  successively  three  tiaes  to  effect  the  desired  astriz  product 
inverse  recursion.  The  aain  steps  of  this  recursive  inversion  are  listed  ia 
Table  10.2.  It  is  iaportant  to  note  that  this  recursion  ccaaenoes  at  N  - 
q+p+kj+1  which  corresponds  to  the  first  tine  instant  at  which  the  astriz 
produot  is  generally  invertible-.  Steps  3  through  6  provide  the 

aeehaaisa  for  this  astriz  product  inversion  while  step  7  gives  the  required 
solution  to  the  ABMA  aodeling  equations  (10.6).  In  tern  of  ecaputational 
coaplczity,  it  is  readily  shown  that  the  nabber  of  aultiplieation  and 
addition  operations  required  to  iaplsaent  this  algoritha  is  of  order  p(t+3p) 
for  each  data  point  update. 

The  adaptive  algoritha  described  in  Table  10.2  is  for  the  particular 
uonpo stwindowing  selection  kj»l  wherein  the  paraaeter  kj  will  be  typieally 
selected  to  satisfy  1  i  kj  ^  t.  As  suggested  earlier,  the  covariance  aethod 
identified  by  kj»t  and  kj"l  generally  provides  the  best  overall  aodeling 
perforaanee  for  the  class  of  adaptive  estiaatore.  Ve  nay  therefore  use  the 
adaptive  AIMA  aodeling  aethod  to  provide  an  efficient  procedure  for 
recursively  iapleaenting  the  desirable  covariance  aethod.  As  a  final  note. 
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althoagh  it  is  possible  to  offset  adoptive  iapleaeatatioas  for  other  okoioes 
°*  ki  (i.e. ,  kifl),  the  resol  teat  algorithm  is  of  e  awk  sore  eosiples  sstne, 
Siaee  the  oorsrieaee  aethod  is  elaost  invariably  wsed,  however,  we  shall  be 
ooateat  with  the  aoapostwiadowiag  algoritha. 

It  is  also  possible  to  provide  a  lattice  iapleaeatatioa  of  the  adaptive 
algoritha  here  developed  [19]  aad  [49]  •  This  will  eatail  restricting  t  ■  p 
thereby  imparting  a  deerease  in  speotral  estiaatioa  perf oraanoe.  The 
advantage  aeeraed  by  nsing  the  lattice  iapleaeatatioa  is  ooapatatioaal  in 
as tore.  Ia  particular,  the  aoaber  of  operations  to  apdate  the  lattioe 
network  is  o(p  ap)  for  each  new  tiae  series  observation. 


ts 


STEP  0  IE*  input  to  sohmsi  «] it  algoritha  at  N  ■  q+p+kj+1  it 

^  m4 


STEP  1 

N  »  q+p+kj+1 

STEP  2 

Coaputs  %+J  using  txprsssion  (10.7) 

EB 

• 

*1  “  7Ji 

STEP  4 

H  ■  AN*  *1  “  AN*  Al-1  ■ 

Coaputs  [Aj  +  xj]"1  »*i*g  Lata  9.1 

STEP  5 

E2  “  2j»  2j  “  %'  hT1  "  IAj  +  a}  Ill-1 

Coapnt •  [A^  +  JJ2  Ij]-1  using  Lsaas  9.1 

STEP  6 

S3  -  (%yj>  %>  Is  -  %•  As"1  -  tAj  +  j|  Hi”1 

Coaputs  (A3  ♦  13I”1  “  t%+lIN+lIK+l%+l^”1  u,in* 

Lanas  9.1 

STEP  7 

S  "  I  xN+fTN+lTN+JIN+l  1  -14l 

JH+1  “  o(l)_li  whsrs  o(l)  is  th*  first  ooaponsnt  of  s. 

STEP  8 

TABLE  10.2:  Adapt  It  •  Algoritha  foe  Computing 
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XII.  Conclusions 

A  philosophy  dirsoted  towards  the  rational  nods  ling  of  vido*-aonaa 
stationary  tins  series  has  been  presented.  It  is  explicitly  baaed  npon  the 
Till  e-Walker  equations  whieh  characterize  the  autocorrelation  sequence 
aasoeiated  with  the  rational  tine  aeries  being  aodeled.  In  particular,  the 
key  concept  is  that  of  using  an  overdeterainad  set  of  Yule-Walker  equation 
evaluations  for  estiaating  the  paraaeters  of  a  postulated  rational  aodel. 
This  approach  has  been  found  to  reduce  the  data  induced  hypersensitivity  of 
the  paraaeter  estiaates  in  eoaparison  to  aany  of  the  aore  popular  paraaetrio 
approaches  which  invoke  a  aiaiaal  set  of  evaluations  for  obtaining  the 
paraaeter  estiaates.  Xhese  latter  aethods  include  the  Burg  algoritha,  aany 
LMS  aethods,  and  the  one-step  predictor,  Coaparative  ezaaples  illustrating 
this  reduced  hypersensitivity  have  been  given  in  whieh  the  aodel ing  is  based 
on  both  exact  autocorrelation  lag  infoxaation,  and,  raw  tiae  series 
observations. 

The  aethod  of  singular  value  deeoapoaition  was  next  introduced  and  was 
used  to  obtain  an  effective  rational  aodel  order  detexaination  procedure  as 
well  as  providing  a  novel  rational  node ling  procedure  whose  perforaanoe  has 
been  eapirically  found  to  often  exceed  that  of  existing  techniques.  Studies 
are  currently  under  way  to  aore  effectively  use  this  SVD  adaption  for 
achieving  yet  further  perforaanoe  iaproveaents. 
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